
COMPUTER

ARCHITECTURE AND
ORGANIZATION
ECEG -3143
Guide Book

FACULTY OF TECHNOLOGY
Department of Electrical & Computer Engineering

Contents
CHAPTER 01
[FUNDAMENTAL CONCEPTS OF COMPUTER ORGANIZATION & ARCHITECTURE]
Chapter Description
1.1 Introduction
1.1.1. Organization and Architecture
1.1.2. Structure and Function
1.2 Computer Evolution and Performance
1.2.1. Brief History of Computers
1.2.2. Measuring Performance
1.2.3. Performance Improvement Techniques
CHAPTER 02
[A TOP LEVEL VIEW OF COMPUTER]
Chapter Description
2.1. Computer Components
Program Concept
What is a program?
Function of Control Unit
Components
Computer Components: Top Level View
Some basic registers inside CPU
2.2. Computer Function
Instruction Cycle
Example of Program Execution
Instruction Cycle State Diagram
Interrupts
Interrupt Cycle
Transfer of Control via Interrupts
Instruction Cycle with Interrupts
Instruction Cycle (with Interrupts) - State Diagram
2.3. Interconnection Structures
Computer Modules
Memory Connection
Input/Output Connection
CPU Connection
2.4. Bus Interconnection
What is a Bus?
Bus Interconnection Scheme
Bus Types
Bus Arbitration
Centralised or Distributed Arbitration
CHAPTER 03
[COMPUTER ARITHMETIC AND NUMBERING SYSTEMS]
Chapter Description
3.1. Arithmetic and Logic unit (ALU)
3.2. Integer Representation
Sign-Magnitude
Two's Complement
Conversion Between different bit Lengths
3.3. Integer Arithmetic
Addition and Subtraction
Multiplication
Division
3.4. Floating Point Representation
Real Numbers
IEEE Standard for Binary Floating-Point Representation
3.5. Floating Point Arithmetic
FP Arithmetic +/-
FP Arithmetic x/÷
CHAPTER 04
[INSTRUCTION SETS AND ADDRESSING MODES]
Chapter Description
4.1 Instruction sets
4.1.1 Introduction
4.1.2 Instruction Format
4.1.3 Instruction Types
4.2 Addressing modes
4.2.1 Immediate addressing modes
4.2.2 Direct addressing modes
4.2.3 Register addressing modes
4.2.4 Register indirect addressing modes
4.2.5 Displacement addressing modes
4.2.6 Stack addressing modes
x86 addressing modes
CHAPTER 05
[PROCESSOR ORGANIZATION & INSTRUCTION CYCLE]
Chapter description
5.1 Processor organization
5.2 Register Organizations
Types of registers
5.3 Instruction cycle and Pipeline
Instruction cycle
Instruction Pipelining
CHAPTER 06
[COMPUTER MEMORY]
6.1 Computer Memory System Overview
6.2 Cache Memory
CHAPTER 07
[Input/output]
7.1 EXTERNAL DEVICES
7.2 I/O MODULES
7.2.1 I/O steps
7.3 I/O techniques

Table of Figures

Figure 1.1 the Computer
Figure 1.2 the computer top-level structure
Figure 2.1 Computer components: Top level view
Figure 2.2 Instruction cycle
Figure 2.3 Example of Program Execution (contents of memory and registers in hexadecimal)
Figure 2.4 Characteristics of a Hypothetical Machine
Figure 2.5 Instruction Cycle State Diagram
Figure 2.6 Transfer of Control via Interrupts
Figure 2.7 Instruction Cycle with Interrupts
Figure 2.8 Instruction Cycle (with Interrupts) - State Diagram
Figure 2.8 Computer modules
Figure 2.9 Bus Interconnection Scheme
Figure 3.1 ALU inputs and outputs
Figure 3.2 Hardware for Addition and subtraction
Figure 3.3 Flowchart for unsigned binary multiplication
Figure 3.4 Hardware Implementation of Unsigned Binary Multiplication
Figure 3.5 Booth's Algorithm for Two's Complement Multiplication
Figure 3.5 Flowchart for Unsigned Binary Division
Figure 3.6 Expressible Numbers in Typical 32-Bit Formats
Figure 3.7 FP addition and subtraction
Figure 4.1 Unconditional jump program sequence
Figure 4.2 Conditional jump program sequence

CHAPTER 01
[FUNDAMENTAL CONCEPTS OF COMPUTER
ORGANIZATION & ARCHITECTURE]
Chapter Description
Chapter 1 introduces the concept of the computer as a hierarchical system. A computer can be viewed as
a structure of components and its function described in terms of the collective function of its
cooperating components. Each component, in turn, can be described in terms of its internal structure
and function. The major levels of this hierarchical view are introduced.

The chapter also discusses the history of computers, how computer performance is measured, and
techniques used to improve computer performance.

1.1 Introduction
Brainstorming

What is a computer? List some of the computers you know.

This course is about the structure and function of computers. Its purpose is to present, as clearly and
completely as possible, the nature and characteristics of modern-day computers. This task is a
challenging one for two reasons.

First, many different devices are considered computers, and they vary widely in cost, size,
performance, and application.
Second, the rapid pace of change that has always characterized computer technology continues
with no letup.

In spite of the variety and pace of change in the computer field, certain fundamental concepts apply
consistently throughout.

The intent of this course is to provide a complete discussion of the fundamentals of computer
organization and architecture and to relate these to contemporary computer design issues.

1.1.1. Organization and Architecture


Computer architecture refers to those attributes of a system visible to a programmer or, put another
way, those attributes that have a direct impact on the logical execution of a program.

Computer organization refers to the operational units and their interconnections that realize the
architectural specifications.

Examples of architectural attributes include

the instruction set,
the number of bits used to represent various data types (e.g., numbers, characters),
I/O mechanisms, and
techniques for addressing memory.

Organizational attributes include those hardware details transparent to the programmer, such as

control signals,
interfaces between the computer and peripherals, and
the memory technology used.

Historically, and still today, the distinction between architecture and organization has been an important
one.

This course examines both computer organization and computer architecture. The emphasis is perhaps
more on the side of organization.

1.1.2. Structure and Function


Most complex systems, including the computer, are hierarchical in nature. A hierarchical system is a set of
interrelated subsystems, each of the latter, in turn, hierarchical in structure until we reach some lowest
level of elementary subsystem.

The hierarchical nature of complex systems is essential to both their design and their description. The
designer need only deal with a particular level of the system at a time. At each level, the system consists
of a set of components and their interrelationships. At each level, the designer is concerned with
structure and function:

Structure: The way in which the components are interrelated


Function: The operation of each individual component as part of the structure

The computer system will be described from the top down. We begin with the major components of a
computer, describing their structure and function, and proceed to successively lower layers of the
hierarchy.

Function
In general terms, there are only four basic functions that a computer can perform:

Data processing
Data storage
Data movement
Control

Structure
Figure 1.1 is the simplest possible depiction of a computer.

Figure 1.1 the Computer

But of greater concern in this course is the internal structure of the computer itself, which is shown in
Figure 1.2.

There are four main structural components:

Central processing unit (CPU)
I/O
Main memory
System interconnection

Each of these components will be examined in some detail in other chapters.

The most interesting and in some ways the most complex component is the CPU. Its major structural
components are as follows:

Control Unit
Arithmetic and Logic Unit (ALU)
Registers
CPU interconnection

Each of these components will also be examined in some detail in Chapter 5, Processor Organization &
Instruction Cycle.

Finally, there are several approaches to the implementation of the control unit; one common approach is
a microprogrammed implementation. With this approach, the structure of the control unit can be
depicted as in Figure 1.2. This structure will be examined in Chapter 8.

Figure 1.2 the computer top-level structure

1.2 Computer Evolution and Performance


1.2.1. Brief History of Computers
In computer terminology, a generation is a change in the technology on which computers are built.
Initially, the term was used to distinguish between varying hardware technologies. Nowadays, it
covers both hardware and software, which together make up an entire computer system.

Five computer generations are known to date. Each is listed below with its time period and main
characteristic. The dates given are approximate but generally accepted.

The five main generations of computers are as follows.

First Generation

The period of the first generation: 1946-1959. Vacuum tube based.

Second Generation

The period of the second generation: 1959-1965. Transistor based.

Third Generation

The period of the third generation: 1965-1971. Integrated circuit based.

Fourth Generation

The period of the fourth generation: 1971-1980. VLSI microprocessor based.

Fifth Generation

The period of the fifth generation: 1980 onwards. ULSI microprocessor based.

1.2.2. Measuring Performance


All computers have a clock that determines when events take place.

One period of this clock is called the clock cycle time.

The average number of clock cycles per instruction for a program is called clock cycles per instruction (CPI).

CPU execution time: Total time a CPU spends computing on a given task (excludes time for I/O or
running other programs). This is also referred to as simply CPU time.

Instruction count: number of instructions executed by a program

Less CPU time => Better performance
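
The quantities defined above combine in the standard relation CPU time = instruction count x CPI x clock cycle time. The following is a minimal sketch of that relation; the function name and the two sets of processor figures are hypothetical values chosen only for illustration and do not come from this guide.

```python
def cpu_time(instruction_count, cpi, clock_cycle_time_s):
    """CPU time in seconds = instruction count x CPI x clock cycle time."""
    return instruction_count * cpi * clock_cycle_time_s


# Two hypothetical processors running the same 1,000,000-instruction program.
time_a = cpu_time(instruction_count=1_000_000, cpi=2.0, clock_cycle_time_s=1e-9)
time_b = cpu_time(instruction_count=1_000_000, cpi=1.5, clock_cycle_time_s=2e-9)

# Less CPU time means better performance.
faster = "A" if time_a < time_b else "B"
print(faster)  # A
```

Processor B has the lower CPI, but its longer clock cycle makes it slower overall, which is why all three factors have to be considered together.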

Q. Which processor has better performance, P1 or P2?

Response time: Total time to complete a task, including time spent executing on the CPU, accessing disk
and memory, waiting for I/O and other processes, and operating system overhead.

Throughput (Bandwidth): Number of tasks completed per unit time

To improve performance:

Reduce response time

Increase throughput

1.2.3. Performance Improvement Techniques


Obvious solution: Increase clock rate

The clock rate is the inverse of the clock cycle time.

(Increasing clock rate => reduced response time => improved performance)

Performance can be improved by improving response time and/or throughput

Techniques that improve response time

Increasing clock rate

Cache

Techniques that improve throughput

Instruction-level parallelism (pipelining)

Multiple cores

Pipelining:

The processor fetches, decodes, executes and writes back different instructions at the same time.

Pipelining only improves throughput

Fetch Unit gets the next instruction from the cache.

Decode Unit determines type of instruction.

Instruction and data sent to Execution Unit.

Write Unit stores result.
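
The effect of the four units working in parallel can be sketched with a simple cycle count. This sketch assumes an ideal pipeline in which every stage takes exactly one clock cycle and nothing ever stalls; that assumption, and the function names, are not part of this guide.

```python
def cycles_unpipelined(n_instructions, n_stages):
    """Each instruction passes through every stage before the next one starts."""
    return n_instructions * n_stages


def cycles_pipelined(n_instructions, n_stages):
    """Ideal pipeline: the first instruction needs n_stages cycles,
    after which one instruction completes every cycle."""
    if n_instructions == 0:
        return 0
    return n_stages + (n_instructions - 1)


# Four stages as in the text: fetch, decode, execute, write.
print(cycles_unpipelined(100, 4))  # 400
print(cycles_pipelined(100, 4))    # 103
```

Each individual instruction still takes four cycles from fetch to write, so the gain is in throughput rather than in the time for a single instruction.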

Multiple core:

Modern microprocessors contain multiple processors (cores) on a single chip

CHAPTER 02
[A TOP LEVEL VIEW OF COMPUTER]
Chapter Description
This chapter provides a brief examination of the computer's components and their input-output
requirements. It also looks at key issues that affect interconnection design, especially the need to
support interrupts.

2.1. Computer Components


Program Concept
Hardwired systems are inflexible

General purpose hardware can do different tasks, given correct control signals

Instead of re-wiring, supply a new set of control signals

What is a program?
A sequence of steps

For each step, an arithmetic or logical operation is done

For each operation, a different set of control signals is needed

Function of Control Unit


For each operation a unique code is provided

e.g. ADD, MOVE

A hardware segment accepts the code and issues the control signals

Components
The Control Unit and the Arithmetic and Logic Unit constitute the Central Processing Unit

Data and instructions need to get into the system and results out

Input/output

Temporary storage of code and results is needed

Main memory

Computer Components: Top Level View

Figure 2.1 Computer components: Top level view

Some basic registers inside CPU


Program Counter (PC): Holds address of next instruction to fetch.
Instruction Register (IR): Temporarily holds fetched instruction while it is being read (decoded)
by CPU.
Memory Address Register (MAR): Specifies the memory address of the word to be written from
the MBR or read into the MBR.
Memory Buffer (data) register (MBR): Contains a word to be stored in memory or is used to
receive a word from memory.
Input/output address register (I/O AR): specifies a particular I/O device.
Input/output buffer register (I/O BR): used for the exchange of data between an I/O module and
the CPU.

2.2. Computer Function


Instruction Cycle
Two steps:

Fetch

Execute

Figure 2.2 Instruction cycle

Fetch Cycle
Program Counter (PC) holds address of next instruction to fetch

Processor fetches instruction from memory location pointed to by PC

Increment PC

Unless told otherwise

Instruction loaded into Instruction Register (IR)

Processor interprets instruction and performs required actions

Execute Cycle

Processor-memory

data transfer between CPU and main memory

Processor I/O

Data transfer between CPU and I/O module

Data processing

Some arithmetic or logical operation on data

Control

Alteration of sequence of operations

e.g. jump

Combination of above

Example of Program Execution


Consider a simple example using a hypothetical machine that includes the characteristics listed in Figure
2.4.

Figure 2.3 illustrates a partial program execution, showing the relevant portions of memory and
processor registers. The program fragment shown adds the contents of the memory word at address 940
to the contents of the memory word at address 941 and stores the result in the latter location.
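
The fetch and execute steps for this program fragment can be sketched in a few lines. Because Figures 2.3 and 2.4 are not reproduced here, the sketch assumes a 16-bit instruction word with a 4-bit opcode and a 12-bit address, the opcodes 1 (load AC from memory), 2 (store AC to memory) and 5 (add memory to AC), a program starting at address 300, and initial data values 3 and 2. All of these are assumptions for illustration, and all numbers are hexadecimal.

```python
# Assumed opcodes (top 4 bits of a 16-bit instruction word).
LOAD, STORE, ADD = 0x1, 0x2, 0x5


def run(memory, pc, steps):
    """Run `steps` instruction cycles and return the final (PC, AC)."""
    ac = 0
    for _ in range(steps):
        ir = memory[pc]              # fetch: instruction goes into IR
        pc += 1                      # increment PC
        opcode = ir >> 12            # execute: decode opcode and address
        address = ir & 0xFFF
        if opcode == LOAD:
            ac = memory[address]
        elif opcode == STORE:
            memory[address] = ac
        elif opcode == ADD:
            ac = (ac + memory[address]) & 0xFFFF
        else:
            raise ValueError(f"unknown opcode {opcode:#x}")
    return pc, ac


memory = {
    0x300: 0x1940,  # load AC from 940
    0x301: 0x5941,  # add contents of 941 to AC
    0x302: 0x2941,  # store AC to 941
    0x940: 0x0003,  # assumed data value
    0x941: 0x0002,  # assumed data value
}
pc, ac = run(memory, pc=0x300, steps=3)
print(hex(pc), ac, memory[0x941])  # 0x303 5 5
```

Three instruction cycles are needed, each consisting of one fetch followed by one execute.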

Figure 2.3 Example of Program Execution (contents of memory and registers in hexadecimal)

Figure 2.4 Characteristics of a Hypothetical Machine

Instruction Cycle State Diagram

Figure 2.5 Instruction Cycle State Diagram

Interrupts
Mechanism by which other modules (e.g. I/O) may interrupt normal sequence of processing

Program

e.g. overflow, division by zero

Timer

Generated by internal processor timer

Used in pre-emptive multi-tasking

I/O

from I/O controller

Hardware failure

e.g. memory parity error

Interrupt Cycle
Added to instruction cycle

Processor checks for interrupt

Indicated by an interrupt signal

If no interrupt, fetch next instruction

If interrupt pending:

Suspend execution of current program

Save context

Set PC to start address of interrupt handler routine

Process interrupt

Restore context and continue interrupted program
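
The interrupt steps listed above can be sketched as a loop that checks for a pending interrupt after each instruction. This is a simplified model, not a description of any real processor: instructions are plain labels, the saved context is only the PC, and the function and parameter names are hypothetical.

```python
def run_with_interrupts(program, handler, pending_after):
    """Return the order in which instructions execute.

    program, handler: lists of instruction labels (hypothetical).
    pending_after: set of program indices after which an interrupt is pending.
    """
    trace = []
    pc = 0
    while pc < len(program):
        current = pc
        trace.append(program[pc])      # fetch and execute one instruction
        pc += 1                        # PC now points at the next instruction
        if current in pending_after:   # interrupt check at the end of the cycle
            saved_pc = pc              # save context (here only the PC)
            trace.extend(handler)      # process the interrupt in the handler
            pc = saved_pc              # restore context, resume the program
    return trace


print(run_with_interrupts(["i0", "i1", "i2"], ["h0", "h1"], {1}))
# ['i0', 'i1', 'h0', 'h1', 'i2']
```

The interrupted program continues with i2 exactly as if the handler had never run, which is the purpose of saving and restoring the context.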

Transfer of Control via Interrupts

Figure 2.6 Transfer of Control via Interrupts

Instruction Cycle with Interrupts

Figure 2.7 Instruction Cycle with Interrupts

Instruction Cycle (with Interrupts) - State Diagram

Figure 2.8 Instruction Cycle (with Interrupts) - State Diagram

2.3. Interconnection Structures
A computer consists of a set of components or modules of three basic types (processor, memory,
I/O) that communicate with each other. In effect, a computer is a network of basic modules.
Thus, there must be paths for connecting the modules.
The collection of paths connecting the various modules is called the interconnection structure.
Different types of unit need different types of connection:
o Memory
o Input/Output
o CPU

Computer Modules
Figure 2.8 suggests the types of exchanges that are needed by indicating the major forms of input and
output for each module type:

Figure 2.8 Computer modules

Memory Connection
Receives and sends data

Receives addresses (of locations)

Receives control signals

Read

Write

Input/Output Connection
Similar to memory from the processor's viewpoint

Output

Receive data from computer

Send data to peripheral

Input

Receive data from peripheral

Send data to computer

Receive control signals from computer

Send control signals to peripherals

e.g. spin disk

Receive addresses from computer

e.g. port number to identify peripheral

Send interrupt signals

CPU Connection
Reads instruction and data

Writes out data (after processing)

Sends control signals to other units

Receives (& acts on) interrupts

2.4. Bus Interconnection


What is a Bus?
A shared communication pathway connecting two or more devices

Usually broadcast

Often grouped

A number of channels in one bus

e.g. 32 bit data bus is 32 separate single bit channels

Power lines may not be shown

Data Bus
Carries data

Remember that there is no difference between data and instruction at this level

Width is a key determinant of performance

8, 16, 32, 64 bit

Address bus
Identify the source or destination of data

e.g. CPU needs to read an instruction (data) from a given location in memory

Bus width determines maximum memory capacity of system

e.g. 8080 has 16 bit address bus giving 64k address space
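
The link between address bus width and address space is a power of two: n address lines can select 2^n distinct locations. A minimal sketch (the function name is hypothetical):

```python
def address_space(address_bus_bits):
    """Number of distinct addresses reachable with the given bus width."""
    return 2 ** address_bus_bits


print(address_space(16))          # 65536
print(address_space(16) // 1024)  # 64, i.e. 64K locations
```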

Control Bus
Control and timing information

Memory read/write signal

I/O read/write signal

Bus request/grant

Interrupt request

Clock signals

Bus Interconnection Scheme

Figure 2.9 Bus Interconnection Scheme

Bus Types
Dedicated

Separate data & address lines

Multiplexed

Shared lines

Address valid or data valid control line

Advantage - fewer lines

Disadvantages

More complex control

Reduced performance

Bus Arbitration
More than one module controlling the bus

e.g. CPU and DMA controller

Only one module may control bus at one time

Arbitration may be centralised or distributed

Centralised or Distributed Arbitration


Centralised

Single hardware device controlling bus access

Bus Controller

Arbiter

May be part of CPU or separate

Distributed

Each module may claim the bus

Control logic on all modules

CHAPTER 03
[COMPUTER ARITHMETIC AND NUMBERING SYSTEMS]
Chapter Description
This chapter examines the functionality of the arithmetic and logic unit (ALU) and focuses on the
representation of numbers and techniques for implementing arithmetic operations. Processors typically
support two types of arithmetic: integer, or fixed point, and floating point. For both cases, the chapter
first examines the representation of numbers and then discusses arithmetic operations.

3.1. Arithmetic and Logic unit (ALU)


Does the calculations

Everything else in the computer is there to service this unit

Handles integers

May handle floating point (real) numbers

Figure 3.1 ALU inputs and outputs

3.2. Integer Representation


Only have 0 & 1 to represent everything

Positive numbers stored in binary

e.g. 41=00101001

No minus sign

No period

Sign-Magnitude

Two's complement

Sign-Magnitude
Left most bit is sign bit

0 means positive

1 means negative

+18 = 00010010

-18 = 10010010

Problems

Need to consider both sign and magnitude in arithmetic

Two representations of zero (+0 and -0)
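
The sign-magnitude examples and the two-zeros problem can be checked with a short sketch. The helper names are hypothetical, and an 8-bit word is assumed by default, as in the examples above.

```python
def to_sign_magnitude(value, bits=8):
    """Encode an integer as a sign-magnitude bit string."""
    magnitude = abs(value)
    if magnitude >= 1 << (bits - 1):
        raise ValueError("magnitude does not fit in the available bits")
    sign = 1 if value < 0 else 0
    return format((sign << (bits - 1)) | magnitude, f"0{bits}b")


def from_sign_magnitude(bit_string):
    """Decode a sign-magnitude bit string back to an integer."""
    sign = -1 if bit_string[0] == "1" else 1
    return sign * int(bit_string[1:], 2)


print(to_sign_magnitude(18))            # 00010010
print(to_sign_magnitude(-18))           # 10010010
# Two different bit patterns both decode to zero.
print(from_sign_magnitude("00000000"))  # 0
print(from_sign_magnitude("10000000"))  # 0
```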

Two's Complement
+3 = 00000011

+2 = 00000010

+1 = 00000001

+0 = 00000000

-1 = 11111111

-2 = 11111110

-3 = 11111101

Benefits
One representation of zero

Arithmetic works easily (see later)

Negating is fairly easy

3 = 00000011

Boolean complement gives 11111100

Add 1 to LSB 11111101

Negation Special Case 1


0= 00000000

Bitwise not 11111111

Add 1 to LSB +1

Result 1 00000000

Overflow is ignored, so:

-0=0

Negation Special Case 2
-128 = 10000000

bitwise not 01111111

Add 1 to LSB +1

Result 10000000

So:

-(-128) = -128 X

Monitor MSB (sign bit)

It should change during negation
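
The negation rule and its two special cases can be checked with a small sketch. The function name `negate8` and the 8-bit width are choices made for this example.

```python
def negate8(x):
    """Negate an 8-bit twos complement bit pattern (0..255).

    Returns (result, overflow). The result is the bitwise complement plus 1,
    with any carry out of the MSB discarded.
    """
    result = ((x ^ 0xFF) + 1) & 0xFF
    # The sign bit should change during negation, except for zero.
    overflow = x != 0 and (result & 0x80) == (x & 0x80)
    return result, overflow


print(format(negate8(0b00000011)[0], "08b"))  # pattern for -3
print(negate8(0b00000000))                    # special case 1: -0 = 0
print(negate8(0b10000000))                    # special case 2: -(-128) overflows
```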

Range of Numbers
8 bit 2s complement

+127 = 01111111 = $2^7 - 1$

-128 = 10000000 = $-2^7$

16 bit 2s complement

+32767 = 01111111 11111111 = $2^{15} - 1$

-32768 = 10000000 00000000 = $-2^{15}$

Conversion Between different bit Lengths


Positive number pack with leading zeros

+18 = 00010010

+18 = 00000000 00010010

Negative numbers pack with leading ones

-18 = 11101110

-18 = 11111111 11101110

i.e. pack with MSB (sign bit)
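
A minimal sketch of this sign-extension rule, using the +18 and -18 patterns from the text. The function name `sign_extend` is chosen for this example.

```python
def sign_extend(pattern, from_bits, to_bits):
    """Widen a twos complement bit pattern by copying its sign bit (MSB)."""
    sign = (pattern >> (from_bits - 1)) & 1
    if sign:
        # Fill the new upper bits with ones.
        pattern |= ((1 << to_bits) - 1) & ~((1 << from_bits) - 1)
    return pattern  # for positive numbers the upper bits are already zero


print(format(sign_extend(0b00010010, 8, 16), "016b"))  # +18
print(format(sign_extend(0b11101110, 8, 16), "016b"))  # -18
```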

3.3. Integer Arithmetic


Addition and Subtraction
Normal binary addition

Monitor sign bit for overflow

Take twos complement of subtrahend and add to minuend

i.e. a - b = a + (-b)

So we only need addition and complement circuits

Overflow rule
If two numbers are added, and they are both positive or both negative, then overflow occurs if and only
if the result has the opposite sign.
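
The overflow rule, together with a - b = a + (-b), can be sketched for 8-bit patterns as follows. The names `add8` and `sub8` are chosen for this example, and operands are assumed to be bit patterns in the range 0..255.

```python
def add8(a, b):
    """Add two 8-bit twos complement patterns; return (result, overflow)."""
    result = (a + b) & 0xFF  # carry out of the MSB is discarded
    sign_a, sign_b, sign_r = a >> 7, b >> 7, result >> 7
    # Overflow if and only if both operands share a sign and the result does not.
    overflow = sign_a == sign_b and sign_r != sign_a
    return result, overflow


def sub8(a, b):
    """Compute a - b as a + (twos complement of b).

    Note: this sketch does not treat b = -128 specially, whose negation
    itself overflows.
    """
    return add8(a, ((b ^ 0xFF) + 1) & 0xFF)


print(add8(0x7F, 0x01))  # +127 + 1 does not fit in 8 bits
print(sub8(7, 3))
```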

Hardware for Addition and Subtraction

Figure 3.2 Hardware for Addition and subtraction

Multiplication
Complex

Work out partial product for each digit

Take care with place value (column)

Add partial products

Example:
      1011   Multiplicand (11 dec)
    x 1101   Multiplier (13 dec)
      ----
      1011   Partial products
     0000    Note: if multiplier bit is 1, copy multiplicand (place value);
    1011     otherwise zero
   1011
  --------
  10001111   Product (143 dec)

Note: need double length result
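
The shift-and-add procedure can be sketched in software as below. The register names A, Q, M and C follow the usual textbook convention; the function name and the default 4-bit width are assumptions made for this example.

```python
def multiply_unsigned(m, q, n=4):
    """Multiply two n-bit unsigned integers by shift-and-add.

    M holds the multiplicand, Q the multiplier, A the running partial
    product and C the carry out of A. The 2n-bit product ends up in A,Q.
    """
    mask = (1 << n) - 1
    a, c = 0, 0
    for _ in range(n):
        if q & 1:                      # multiplier bit is 1: add multiplicand
            total = a + m
            c = total >> n
            a = total & mask
        # Shift C, A, Q right by one bit as a single register.
        q = (q >> 1) | ((a & 1) << (n - 1))
        a = (a >> 1) | (c << (n - 1))
        c = 0
    return (a << n) | q                # double-length result


print(multiply_unsigned(0b1011, 0b1101))  # 11 x 13
```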

Flowchart for Unsigned Binary Multiplication

Figure 3.3 Flowchart for unsigned binary multiplication

Execution of Example
M multiplicand Q multiplier

Hardware implementation of Unsigned Binary Multiplication

Figure 3.4 Hardware Implementation of Unsigned Binary Multiplication

Multiplying negative numbers


Comparison of Multiplication of Unsigned and Twos Complement Integers

This does not work!


Solution 1
Convert to positive if required
Multiply as above
If signs were different, negate answer
Solution 2
Booths algorithm

Booths Algorithm

Figure 3.5 Booths Algorithm for Twos Complement Multiplication

Example of Booths Algorithm (7 × 3)
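
A software sketch of Booths algorithm for n-bit twos complement operands is given below. The function name and the 4-bit default are assumptions for this example, and the sketch assumes the multiplicand is not the most negative value (-8 for 4 bits), whose negation does not fit in n bits.

```python
def booth_multiply(m, q, n=4):
    """Multiply two n-bit twos complement integers with Booths algorithm."""
    mask = (1 << n) - 1
    m &= mask                          # work on n-bit patterns
    q &= mask
    a, q_1 = 0, 0                      # A register and the extra bit Q-1
    for _ in range(n):
        pair = (q & 1, q_1)
        if pair == (1, 0):
            a = (a - m) & mask         # A <- A - M
        elif pair == (0, 1):
            a = (a + m) & mask         # A <- A + M
        # Arithmetic shift right of A, Q, Q-1 (sign bit of A is preserved).
        q_1 = q & 1
        q = (q >> 1) | ((a & 1) << (n - 1))
        a = (a >> 1) | (a & (1 << (n - 1)))
    product = (a << n) | q
    if product & (1 << (2 * n - 1)):   # interpret A,Q as a signed 2n-bit value
        product -= 1 << (2 * n)
    return product


print(booth_multiply(7, 3))
print(booth_multiply(-3, 5))
```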

Division
More complex than multiplication

Negative numbers are really bad!

Division of Unsigned Binary Integers


An example of the long division of unsigned binary integers.

Flowchart for Unsigned Binary Division

Figure 3.5 Flowchart for Unsigned Binary Division
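
Unsigned division is commonly implemented as restoring division: shift A,Q left, subtract the divisor, and restore A if the result is negative. The sketch below assumes that scheme; the function name and the 4-bit default are choices made for this example.

```python
def divide_unsigned(dividend, divisor, n=4):
    """Restoring division of n-bit unsigned integers.

    Q starts as the dividend and ends as the quotient; A ends as the
    remainder; M holds the divisor. Returns (quotient, remainder).
    """
    if divisor == 0:
        raise ZeroDivisionError("divisor must be non-zero")
    mask = (1 << n) - 1
    a, q, m = 0, dividend & mask, divisor
    for _ in range(n):
        # Shift A,Q left one position.
        a = (a << 1) | ((q >> (n - 1)) & 1)
        q = (q << 1) & mask
        a -= m
        if a < 0:
            a += m                     # unsuccessful: restore A, Q0 stays 0
        else:
            q |= 1                     # successful: set Q0 = 1
    return q, a


print(divide_unsigned(7, 3))
```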

Twos complement division


The algorithm can be summarized as follows:

1. Load the divisor into the M register and the dividend into the A,Q registers.

2. Shift A,Q left one position.

3. If M and A have the same signs, perform A - M, else perform A + M.

4. The preceding operation is successful if the sign of A is the same before and after the operation.
* If the operation is successful or A = 0, then set Q0 = 1.
* If the operation is unsuccessful and A ≠ 0, then set Q0 = 0 and restore the previous value of A.

5. Repeat steps 2 through 4 as many times as there are bit positions in Q.

6. The remainder is in A. If the signs of the divisor and dividend are the same, then the quotient is Q, else it
is the twos complement of Q.

3.4. Floating Point Representation
Real Numbers
Numbers with fractions

Could be done in pure binary

1001.1010 = $2^3 + 2^0 + 2^{-1} + 2^{-3}$ = 9.625

Where is the binary point?

Fixed?

Very limited

Moving?

How do you show where it is?

Floating Point

$\pm\,.\text{significand} \times 2^{\text{exponent}}$

Point is actually fixed between sign bit and body of mantissa

Exponent indicates place value (point position)

Floating Point Examples

Signs for Floating Point

Exponent is in excess or biased notation

e.g. Excess (bias) 127 means

8 bit exponent field

Pure value range 0-255

Subtract 127 to get correct value

Range -127 to +128

Normalization

FP numbers are usually normalized

i.e. exponent is adjusted so that leading bit (MSB) of mantissa is 1

Since it is always 1 there is no need to store it

(c.f. Scientific notation where numbers are normalized to give a single digit before the decimal point

e.g. $3.123 \times 10^3$)

Accuracy

The effect of changing lsb of mantissa

23 bit mantissa: $2^{-23} \approx 1.2 \times 10^{-7}$

About 6 decimal places

Maximum Value is determined by the exponent

For comparison, Figure 3.6 indicates the range of numbers that can be represented in a 32-bit word.

Figure 3.6 Expressible Numbers in Typical 32-Bit Formats

IEEE Standard for Binary Floating-Point Representation


IEEE 754
Standard for floating point storage

32 and 64 bit standards

8 and 11 bit exponent respectively

Extended formats (both mantissa and exponent) for intermediate results

IEEE 754 Formats
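
To make the 32-bit format concrete, the sketch below splits a single-precision value into its 1-bit sign, 8-bit biased exponent and 23-bit fraction fields using Python's standard `struct` module. The function name `decode_float32` is chosen for this example. Note that IEEE 754 reserves the exponent field values 0 and 255 for special cases, so normalized numbers use true exponents from -126 to +127.

```python
import struct


def decode_float32(x):
    """Return (sign, biased_exponent, fraction) of x as an IEEE 754 single."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    sign = bits >> 31
    biased_exponent = (bits >> 23) & 0xFF   # subtract 127 for the true exponent
    fraction = bits & 0x7FFFFF              # leading 1 of the mantissa is not stored
    return sign, biased_exponent, fraction


sign, exp, frac = decode_float32(-0.75)     # -0.75 = -1.1 (binary) x 2^-1
print(sign, exp - 127, format(frac, "023b"))
```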

3.5. Floating Point Arithmetic


FP Arithmetic +/-
Check for zeros

Align significands (adjusting exponents)

Add or subtract significands

Normalize result
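
The four steps above can be sketched with a toy format in which a number is a pair (significand, exponent) meaning significand x 2^exponent. The signed integer significand, the 8-bit width, truncation instead of rounding, and the function names are all simplifying assumptions of this example, not the IEEE 754 procedure.

```python
def _shift_right(sig, k):
    """Shift a signed significand right by k bits, truncating toward zero."""
    return (abs(sig) >> k) * (1 if sig >= 0 else -1)


def fp_add(a, b, bits=8):
    """Add two toy floating point numbers given as (significand, exponent)."""
    (sa, ea), (sb, eb) = a, b
    # Step 1: check for zeros.
    if sa == 0:
        return b
    if sb == 0:
        return a
    # Step 2: align significands by raising the smaller exponent.
    if ea < eb:
        sa, ea = _shift_right(sa, eb - ea), eb
    else:
        sb, eb = _shift_right(sb, ea - eb), ea
    # Step 3: add (subtraction is addition of a negative significand).
    s, e = sa + sb, ea
    if s == 0:
        return (0, 0)
    # Step 4: normalize so that 2^(bits-1) <= |s| < 2^bits.
    while abs(s) >= (1 << bits):
        s, e = _shift_right(s, 1), e + 1
    while abs(s) < (1 << (bits - 1)):
        s, e = s * 2, e - 1
    return (s, e)


print(fp_add((0b11000000, -7), (0b10000000, -8)))  # 1.5 + 0.5
```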

FP Addition & Subtraction Flowchart

Figure 3.7 FP addition and subtraction

FP Arithmetic x / ÷
Check for zero

Add/subtract exponents

Multiply/divide significands (watch sign)

Normalize

Round

All intermediate results should be in double length storage

Floating Point Multiplication Flowchart

Floating Point Division Flowchart

CHAPTER 04
[INSTRUCTION SETS AND ADDRESSING MODES]
Chapter Description
From a programmer's point of view, the best way to understand the operation of a processor is to learn
the machine instruction set that it executes. This chapter therefore studies instruction sets.
Architectural issues such as instruction set design and data types are covered.

4.1 Instruction sets

4.1.1 Introduction
Instructions?

Specify operations to be performed by a computer

Words of a computer's language

Instruction set

Collection of the instructions of a computer

The complete collection of instructions that are understood by a CPU

Elements of an Instruction

Operation code (opcode)

Specifies the operation to be performed

e.g. ADD, SUB, MUL, ...

Addresses (operands)

Provide more information about the operation

May include:

Source operands: specify where operands come from

Destination operands: specify where results go

Next instruction reference: specifies where to fetch next instruction from

Operation Code (Opcode) Addresses (operands)

Instructions to be read by a computer contain strings of 1s and 0s (They are numbers) (Machine
instructions)

Symbolic representations of machine instructions are used for convenience (assembly language)

Even more convenient (High-level languages)

High-level language        Assembly language        Machine language

void main()                main:                    0567
{                            ADD c,a,b
  int a,b,c;
  c = a+b;
}

(High-level language --Compiler--> Assembly language --Assembler--> Machine language)

4.1.2 Instruction Format


How long is an instruction? How many operands?

Defines the layout of the bits of an instruction in terms of its constituent fields (What does each field
represent and how many bits is it?)

Common Instruction formats:

1. Zero operand:

Opcode

2. One operand:

Opcode Address

3. Two operands:

Opcode Address1 Address2

4. Three operands:

Opcode Add.1 Add.2 Add.3
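
To show what "layout of the bits" means in practice, the sketch below packs and unpacks a two-operand instruction. The 16-bit word with a 4-bit opcode and two 6-bit address fields is a hypothetical format invented for this example, as are the function names.

```python
def encode(opcode, addr1, addr2):
    """Pack a hypothetical 16-bit instruction: 4-bit opcode, two 6-bit addresses."""
    return ((opcode & 0xF) << 12) | ((addr1 & 0x3F) << 6) | (addr2 & 0x3F)


def decode(word):
    """Split a 16-bit instruction word back into (opcode, addr1, addr2)."""
    return (word >> 12) & 0xF, (word >> 6) & 0x3F, word & 0x3F


word = encode(0x1, 5, 9)       # opcode 1 with operand addresses 5 and 9
print(format(word, "016b"))
print(decode(word))
```

With only 6 bits per address field this format can name 64 locations, which illustrates the trade-off between instruction length, number of operands and addressable range.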

Instruction Representation
In machine code each instruction has a unique bit pattern

For human consumption (well, programmers anyway) a symbolic representation is used

e.g. ADD, SUB, LOAD

Opcodes are represented by abbreviations, called mnemonics that indicate the operation.
Common examples include
ADD Add

SUB Subtract

MUL Multiply

DIV Divide

LOAD Load data from memory

STOR Store data to memory

Operands can also be represented in this way

ADD A,B

Number of Addresses
One of the traditional ways of describing processor architecture is in terms of the number of addresses
contained in each instruction

4.1.3 Instruction Types


Common types:

Data transfer(Data movement)

Arithmetic

Logical

Input/output

Transfer of control(Program flow control)

System control

Data transfer

Copy values from one location to another


(E.g. MOV, LEA, IN/OUT, PUSH/POP)

MOV destination, source

Destination: can be register or memory location

Source: can be register, memory location or an immediate number

E.g. MOV CX, 20 place the value 20 in CX register (CX ← 20)

MOV CX, [20] copy value at memory location 20 to CX

PUSH source

Used to transfer data to stack

Source: can be register or memory location

E.g. PUSH CX copy CX to stack

POP destination

Used to retrieve data from stack

Destination: can be register or memory location

E.g. POP BX copy data on top of stack to register BX

Arithmetic
(E.g. ADD, INC, SUB, DEC, MUL, DIV)

ADD destination, source

Destination: can be register or memory location

Source: can be register, memory location or an immediate number

E.g. ADD CX, BX (CX ← CX + BX)

DEC destination

Destination: can be register or memory location

E.g. DEC CX (CX ← CX - 1)

MUL source

Source: can be register or memory location

Destination is an accumulator register, AX

E.g. MUL BL (AX ← AL x BL)

Logical
Operate on a bit-by-bit basis

(E.g. AND, OR, XOR, NOT, SHR, SHL)

E.g. AND CX, BX (CX ← CX AND BX)

Input /output
Instructions to read data from an input module and to write data to an output module

(E.g. IN, OUT)

IN accumulator, port OUT port, accumulator

Port: address of the I/O module (8-bits for 8086)

Transfer of control

Instructions discussed so far execute sequentially

Transfer of control instructions change the sequence of execution (update value of the program
counter (PC))

Common transfer of control instructions


Branch (Jump) instructions

Procedure call instruction

Jump instructions
There are two types of jump: unconditional and conditional.

Unconditional jump

Figure 4.1 Unconditional jump program sequence

In unconditional jump, as the instruction is executed, the jump always takes place to change the execution
sequence

Conditional jump

branch is made if a certain condition is met

E.g. JZ target (Jump to target address if result of previous operation is zero)

E.g. SUB CX, 32

JZ label

label: MOV BX, 10

Figure 4.2 Conditional jump program sequence.
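
The effect of a jump on the program counter can be seen in a toy interpreter. The tuple-based instruction encoding, the tiny instruction subset (MOV, SUB, JZ, JMP) and the function name `run` are assumptions made for this sketch; it is not a model of real x86 behaviour.

```python
def run(program, regs):
    """Execute a list of instructions; jump targets are list indices."""
    pc = 0                 # program counter
    zero_flag = False
    while pc < len(program):
        op, *args = program[pc]   # fetch
        pc += 1                   # by default, execution is sequential
        if op == "MOV":
            regs[args[0]] = args[1]
        elif op == "SUB":
            regs[args[0]] -= args[1]
            zero_flag = regs[args[0]] == 0
        elif op == "JZ":          # conditional: jump only if last result was zero
            if zero_flag:
                pc = args[0]
        elif op == "JMP":         # unconditional: always jump
            pc = args[0]
        else:
            raise ValueError(f"unknown opcode {op}")
    return regs


program = [
    ("SUB", "CX", 32),
    ("JZ", 3),             # skip the next instruction when CX became 0
    ("MOV", "BX", 99),
    ("MOV", "AX", 10),     # plays the role of "label" in the text
]
print(run(program, {"AX": 0, "BX": 0, "CX": 32}))
```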

Procedure call Instructions


Instruct the processor to go and execute an entire procedure and return

CALL instruction is used to call the subroutine.

RET instruction must be included at the end of the subroutine to initiate the return sequence to the main program environment

4.2 Addressing modes

An addressing mode is a method of specifying an operand.


How is the address of an operand specified?

Common addressing modes

Immediate

Direct

Register

Register indirect

Displacement

Stack

4.2.1 Immediate addressing modes

Operand is specified in the instruction itself

E.g. MOV R1, 100 (R1 ← 100)

The operand is part of the instruction instead of the contents of a register or a Memory location

Advantage

Does not require extra memory reference to fetch the operand

Drawback

Only a constant can be supplied

The number of values is limited by the size of the operand field

4.2.2 Direct addressing modes

The value of the effective address is encoded directly in the instruction.


E.g. MOV R1, [100]

Memory address    Contents
99                78
100               96
101               0

After execution, CPU register R1 = 96

Advantage

Requires only one memory reference

Drawback

Can address a limited number of memory locations (relatively smaller address space)

4.2.3 Register addressing modes


Register address is specified in the address field of the instruction

E.g. MOV R1, 100 (R1 ← 100), where R1 is the register address

Most common addressing mode in most computers

4.2.4 Register indirect addressing modes


Register that holds memory address is specified in the address field of the instruction

E.g. MOV R1, [R2], where R2 holds the memory address

Advantage

Can address larger number of memory locations compared with direct addressing

4.2.5 Displacement addressing modes


Combines direct addressing and register indirect addressing

A memory address held in a register is added to a displacement value to get the effective address in memory
E.g. MOV R1, [R2+100], where 100 is the displacement value

4.2.6 Stack addressing modes


An implied addressing that refers to the top of a stack

It is implied that the address is contained inside a stack pointer register

E.g. PUSH R1
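
The difference between the modes is only in how the operand is located. The sketch below resolves an operand for several of the modes above using small dictionaries as stand-ins for the register file and memory; the function name, the mode strings and the register contents are assumptions of this example, while the memory contents follow the direct addressing example.

```python
def fetch_operand(mode, field, regs, mem, disp=0):
    """Return the operand selected by the address field under a given mode."""
    if mode == "immediate":
        return field                     # operand is in the instruction itself
    if mode == "direct":
        return mem[field]                # field is the memory address
    if mode == "register":
        return regs[field]               # field names a register
    if mode == "register_indirect":
        return mem[regs[field]]          # register holds the memory address
    if mode == "displacement":
        return mem[regs[field] + disp]   # register contents plus displacement
    raise ValueError(f"unknown addressing mode {mode}")


mem = {99: 78, 100: 96, 101: 0}
regs = {"R2": 99}
print(fetch_operand("direct", 100, regs, mem))
print(fetch_operand("register_indirect", "R2", regs, mem))
print(fetch_operand("displacement", "R2", regs, mem, disp=1))
```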

x86 addressing modes
Register, Immediate

E.g. MOV AX, 0546

Direct

E.g. ADD AX, [0546]

Register Indirect

E.g. MOV [BX], AX

Displacement

Indexed

E.g. MOV AX, [R+0645] where R is an Index register (SI or DI)

Based

E.g. MOV AX, [R+0645] where R is a base register (BX or BP)

CHAPTER 05

[PROCESSOR ORGANIZATION & INSTRUCTION CYCLE]


Chapter description
This chapter is devoted to a discussion of the internal structure and function of the processor. The
chapter describes the use of registers as the CPU's internal memory and then pulls together all of the
material covered so far to provide an overview of CPU structure and function. The overall organization
(ALU, register file, control unit) is reviewed. Then the organization of the register file is discussed. The
instruction cycle is examined to show the function and interrelationship of fetch, indirect, execute, and
interrupt cycles. Finally, the use of pipelining to improve performance is explored in depth.

5.1 Processor organization


What is a processor (CPU) required to do?

Fetch and execute instructions

Task                              Units involved       Source / destination
Fetch instruction                 PC, IR               From memory
Interpret (decode) instruction    Decoding circuit
Fetch data                        MAR, MBR             From memory, I/O
Process data                      ALU
Write data                        MAR, MBR             To memory, I/O

CPU contains:

Registers

Internal processor memory

ALU

performs arithmetic and logic operations (processes data)

Operates only on data in registers

ALU with its inputs and outputs is termed as a data path

Control Unit

Decodes instructions, generates control signals to control the processor

Internal Bus

Interconnects CPU parts

5.2 Register Organizations


Types of registers

User-visible registers
They can be directly accessed (read or written to) by programmers (instructions)

Used to minimize memory reference

Control registers
Used by control unit to control operation of the processor

Status (flag) registers


Indicate the current state (status) of the processor

No clean separation of registers into these categories (depends on the processor)


User-visible registers

General purpose registers


Can be used for a variety of functions

(hold data, used for addressing)

Data registers
Hold only data

e.g. Accumulator (working) register used to store intermediate ALU results

Address registers
Only used for addressing

e.g. Segment registers (SS, DS, CS and ES in x86)


Index registers (SI, DI in x86)

Stack pointer

Control registers
Program Counter (PC): Contains address of next instruction to be fetched
Instruction Register (IR): Temporarily holds most recently fetched instruction
Memory Address Register (MAR): Specifies the address in memory of the word to be written
from or read into the MBR
Memory Buffer Register (MBR): Contains a word to be stored in memory or is used to receive a
word from memory

Status registers
e.g. Flag register (x86), CPSR(ARM)

Flags : Indicate the occurrence of an event in the CPU

Carry flag (CF), Zero flag (ZF), Sign flag (SF), Interrupt flag (IF), Overflow flag (OF)

Used by branch (jump) instructions and interrupts (CPU checks the appropriate flags when a
conditional branch instruction is encountered or when interrupt is enabled)

5.3 Instruction cycle and Pipeline


Instruction cycle
In Section 2.2, we described the processor's instruction cycle. To recall, an instruction cycle includes the
following stages:

Fetch: Read the next instruction from memory into the processor.
Execute: Interpret the opcode and perform the indicated operation.
Interrupt: If interrupts are enabled and an interrupt has occurred, save the current process state
and service the interrupt.

Instruction Cycle with Interrupt

Instruction Pipelining
Review
$\text{Execution time for a program} = \text{no. of instructions} \times \text{CPI} \times \text{clock period}$
where CPI is the average number of clock cycles per instruction

e.g. Suppose a program has 10 instructions with the following relationship between instructions and
clock cycles required to execute each instruction

To reduce execution time:

Reduce clock period (Increase clock frequency)


(Improve response time)
Reduce CPI (execute more instructions with the same number of clock cycles)
(Improve throughput)
One approach to reduce CPI is to overlap execution of instructions (pipelining)

Pipelining
Instruction cycle has several stages (fetch, decode, execute)
Let instructions execute one after the other
(assume one clock cycle per stage (3 clock cycles per instruction) )

Clk

5 clock cycles for 3 instructions (CPI is reduced)

Additional hardware is required for a pipelined processor (pipeline registers between the stages)

In practice the three stages may take different times (clock cycles): execution may take more
time than decoding. This would reduce the effectiveness of the pipeline

Currently decoded instruction has to wait until previous instruction is executed


Throughput is limited by the slowest stage
If we have more stages:
The stages will be of more nearly equal duration
Program execution time is reduced more
e.g. 5-stage pipeline

Operands can be fetched from memory or from registers

Operand can be written to memory or to registers

5-stage Pipeline
Assume:

All instructions require all the five stages


Equal duration for each stage

Assuming one clock cycle per stage, 3 instructions would require 7 clock cycles

Pipeline Performance

Assume an instruction goes through k stages and each stage has a duration of $\tau$

Without pipelining, execution time for n instructions (T) will be:

$T = n k \tau$

With pipelining

$T_{k,n} = (k + (n - 1))\,\tau$

e.g. For $\tau = 1$, k = 5, n = 10

$T = 5 \times 10 = 50$
$T_{k,n} = (5 + (10 - 1)) = 14$

Speed up factor of $\frac{50}{14} = 3.57$

With pipelining the program is executed 3.57 times faster than without pipelining

Speed up factor $S_k = \frac{T}{T_{k,n}} = \frac{nk}{k + (n - 1)}$
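
The pipeline timing relations can be checked numerically with a short sketch. Here n is the number of instructions, k the number of stages and `tau` the duration of one stage; the function names are chosen for this example, and the model assumes no stalls.

```python
def time_unpipelined(n, k, tau=1):
    """Each instruction passes through all k stages before the next starts."""
    return n * k * tau


def time_pipelined(n, k, tau=1):
    """First instruction takes k stage times; each later one completes one stage time after."""
    return (k + (n - 1)) * tau


def speedup(n, k):
    """Ratio of unpipelined to pipelined execution time."""
    return (n * k) / (k + (n - 1))


print(time_unpipelined(10, 5), time_pipelined(10, 5), round(speedup(10, 5), 2))
print(time_pipelined(3, 3), time_pipelined(3, 5))  # the 3-stage and 5-stage cases in the text
```

As n grows large the speed up factor approaches k, the number of stages.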

Pipeline Hazards
Some things could go wrong on real pipelined executions
A pipeline hazard occurs when the pipeline, or some portion of the pipeline, must stall (be idle)
because conditions do not permit continued execution

Pipeline hazards:

Resource (Structural) hazards

Data hazards

Control hazards

Resource Hazards
Occur when two or more instructions that are already in the pipeline need the same resource
o e.g. Memory access

Data Hazards
Occur when one instruction depends on data value produced by a preceding instruction
o e.g.
ADD R1,R2 (R1=1)
ADD R3,R1 (R3=3)

Such hazard is termed as read after write (RAW) hazard since current instruction must wait to
read data until after a previous instruction writes the correct data
The hazard occurs if read takes place before the write operation is complete
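
Detecting a RAW hazard amounts to checking whether an instruction reads a register that a recent earlier instruction writes. In the sketch below each instruction is represented as (destination, sources); this representation, the `window` parameter and the function name are assumptions of the example.

```python
def find_raw_hazards(instructions, window=1):
    """Return (writer_index, reader_index, register) for each RAW dependence.

    window: how many preceding instructions may still be in the pipeline
    (an assumed simplification of real pipeline timing).
    """
    hazards = []
    for j, (_, sources) in enumerate(instructions):
        for i in range(max(0, j - window), j):
            dest = instructions[i][0]
            if dest in sources:
                hazards.append((i, j, dest))
    return hazards


# ADD R1,R2 writes R1; ADD R3,R1 then reads R1.
program = [("R1", ("R1", "R2")), ("R3", ("R3", "R1"))]
print(find_raw_hazards(program))
```

A detect-and-stall design would delay the reader when this list is non-empty, while a detect-and-forward design would route the result directly to the reader.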

Other types of data hazards:
Write after read (WAR)

Write after write (WAW)

Approaches for handling data hazards:


Avoid hazard

Detect and stall

Detect and forward

CHAPTER 06

[COMPUTER MEMORY]
6.1 Computer Memory System Overview
Memory is used to store data and instructions in computers.

There are different types of memory within a computer: registers, cache, main memory (primary memory)
and secondary memory (external memory). A computer may have all or a subset of these
memory types.

The different types of memories can be characterized by their speed, cost per bit and Capacity.

Speed: How fast can data be accessed from the memory. This is defined by the memory access time
(latency). Access time is the time from the instant that an address is presented to the memory to the
instant that data have been stored or made available for use.

The above characteristics for a certain memory type depend on the technology used to manufacture the
memory. The technologies used to manufacture the memory types mentioned above and their
characteristics are summarized below.

Cache: Uses SRAM (Static Random Access memory)

SRAM is made up of transistors (4 to 6 transistors per bit)

Fast (Approximately 2 ns access time)

Expensive ($5 per Megabyte)

Main Memory: Uses DRAM (Dynamic Random Access Memory)

DRAM is made up of transistors and capacitors (1 transistor and 1 capacitor per bit)

Slower than SRAM (Approximately 60 ns access time)

Less expensive than SRAM ($0.012 per Megabyte)

Secondary Memory: Different types (Flash memory, magnetic disks like a hard disk, optical disks like a
CD-ROM)

These memory types are the slowest

They are the least expensive

They are used when large amounts of data have to be stored (also when frequent access is not necessary)

We want to have fast memory with big capacity. But as you can see, as the speed for a certain memory
type increases the price also increases. Having 100s of Gigabytes of the fastest memory only will be very
expensive. Therefore a hierarchy of different types of memory is used in computers.

Fig: hierarchy of different memories

Use a small array of SRAM (cache), larger DRAM (main memory) and even larger secondary memory to
fulfill the need for speed and capacity with a reasonable cost.

Secondary memory permanently holds programs and data used by the computer (it is non-volatile).

Main memory holds instructions for current programs run by the computer (it is volatile).

Cache holds a copy of the portion of main memory most recently accessed by the computer. Since, according
to the principle of locality of reference, the most recently accessed memory locations tend to be accessed
again soon, keeping this data in faster memory (cache) decreases the average memory access time.

The principle of locality of reference states that, if a data location is referenced, then the same location
or data locations with nearby addresses will tend to be referenced soon. This arises from natural
program structures. For example most programs contain loops, so instructions and data are likely to be
accessed repeatedly.
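
The benefit of a cache can be estimated with a simple average access time model. The sketch assumes that a miss costs the cache lookup plus one main memory access, uses the approximate 2 ns and 60 ns access times quoted earlier, and takes an illustrative hit rate of 95%; the function name is chosen for this example.

```python
def average_access_time(hit_rate, t_cache=2.0, t_main=60.0):
    """Average memory access time in ns under a simple hit/miss model."""
    miss_rate = 1.0 - hit_rate
    # Hits cost one cache access; misses cost the cache check plus main memory.
    return hit_rate * t_cache + miss_rate * (t_cache + t_main)


print(average_access_time(0.95))
```

Under these assumptions the average is close to the cache speed even though most of the data lives in the slower main memory.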

6.2 Cache Memory


A cache memory is logically located between a CPU and main memory (physically it is usually embedded
inside the CPU). It contains a copy of portions of memory most recently accessed by the CPU.

A processor may have a single cache or multiple levels of cache. Also there may be separate instruction
and data cache (called split caches), or a single cache to hold both instruction and data (called unified
cache).

Fig: cache memory

When the CPU attempts to read a word from memory, a check is made to determine if the word is in
cache. If so, the word is delivered to the processor (this is called a Hit). If the data is not in cache (this is
called a Miss), a block of memory (several memory words) consisting of that data is read into the cache
and then the required word is delivered to the CPU.

6.2.1 Cache structure

Let the main memory of a computer contain $2^n$ addressable words, with each word having a unique n-bit
address. This memory can be divided into blocks, each block containing a number of addressable words.

Let K = the number of words per block. This implies that there are $2^n / K = M$ blocks in main memory as
shown in the following diagram.

A cache memory consists of multiple tag/block pairs called cache lines. Let us assume a cache has L lines.
The cache structure is shown below.

Each cache line contains control bits, a tag field used in addressing, and a block of memory data.

The number of cache lines is considerably less than the number of main memory blocks

(L<<M). At any time, some subset of the blocks of memory resides in lines in the cache. If a word in a
block of memory is read, that block is transferred to one of the lines of the cache.

Because there are more blocks than lines, an individual line cannot be uniquely and permanently
dedicated to a particular block. Thus, each line includes a tag that identifies which particular block is
currently being stored. The tag is usually a portion of the main memory address.

Memory (main memory) address is specified in instructions. A processor has to know where in the cache to
look for a given piece of data, given the memory address. Therefore, the memory address specified in
instructions has to be translated into a cache line number. This translation of memory address into a cache
line is termed as mapping.

There are different mapping techniques:

Direct Mapping,

Associative Mapping and

Set Associative Mapping
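
As an illustration of the simplest technique, direct mapping places memory block j in cache line j mod L, and the remaining upper part of the block number serves as the tag. The sketch below uses K words per block and L lines as defined above; the function names and the hit/miss counter are additions made for this example.

```python
def direct_map(address, words_per_block, num_lines):
    """Split a word address into (tag, line, word offset) for direct mapping."""
    block = address // words_per_block   # main memory block number
    word = address % words_per_block     # position of the word inside the block
    line = block % num_lines             # the only line this block may occupy
    tag = block // num_lines             # identifies which block is in the line
    return tag, line, word


def simulate(addresses, words_per_block, num_lines):
    """Count (hits, misses) for a sequence of word addresses."""
    lines = {}                           # line number -> tag currently stored
    hits = 0
    for address in addresses:
        tag, line, _ = direct_map(address, words_per_block, num_lines)
        if lines.get(line) == tag:
            hits += 1
        else:
            lines[line] = tag            # miss: load the block, replacing the old one
    return hits, len(addresses) - hits


print(direct_map(37, 4, 8))
print(simulate([0, 1, 2, 3, 0, 32, 0], 4, 8))
```

Addresses 0 and 32 map to the same line here, so they evict each other even though the rest of the cache is empty; associative and set associative mapping relax this restriction.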

CHAPTER 7
[Input/output]
In addition to the processor and a set of memory modules, the third key element of a computer system
is a set of I/O modules. Each module interfaces to the system bus or central switch and controls one or
more peripheral devices.

7.1 EXTERNAL DEVICES


I/O operations are accomplished through a wide assortment of external devices that provide a means of
exchanging data between the external environment and the computer.

An external device connected to an I/O module is often referred to as a peripheral device or, simply, a
peripheral.

We can broadly classify external devices into three categories:

Human readable: Suitable for communicating with the computer user

Machine readable: Suitable for communicating with equipment

Communication: Suitable for communicating with remote devices

7.2 I/O MODULES

Need to Interface to CPU and Memory

Interface to one or more peripherals

The major functions or requirements for an I/O module fall into the following categories:

Control and timing

Processor communication

Device communication

Data buffering

Error detection

7.2.1 I /O steps

The control of the transfer of data from an external device to the processor might involve the following
sequence of steps:

CPU checks I/O module device status


I/O module returns status
If ready, CPU requests data transfer
I/O module gets data from device
I/O module transfers data to CPU
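
The sequence of steps above corresponds to a polling loop on the CPU side. The sketch below uses a made-up `FakeIOModule` class to stand in for real hardware; the class, its method names and the `max_polls` limit are all assumptions of this example.

```python
class FakeIOModule:
    """Stand-in for an I/O module that is busy for a few status checks."""

    def __init__(self, data, busy_polls=2):
        self._data = data
        self._busy_polls = busy_polls

    def status(self):
        # I/O module returns status to the CPU.
        if self._busy_polls > 0:
            self._busy_polls -= 1
            return "busy"
        return "ready"

    def read(self):
        # I/O module gets data from the device and transfers it to the CPU.
        return self._data


def programmed_read(module, max_polls=100):
    """CPU repeatedly checks status, then requests the transfer when ready."""
    polls = 0
    while module.status() != "ready":
        polls += 1
        if polls >= max_polls:
            raise TimeoutError("device never became ready")
    return module.read(), polls


print(programmed_read(FakeIOModule(0x41, busy_polls=2)))
```

The CPU does no other work while it polls, which is the main cost of this style of I/O.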

7.3 I /O techniques
There are three principal I/O techniques:

Programmed I/O, in which I/O occurs under the direct and continuous control of the program requesting
the I/O operation.

Interrupt-driven I/O, in which a program issues an I/O command and then continues to execute, until it
is interrupted by the I/O hardware to signal the end of the I/O operation.

Direct memory access (DMA), in which a specialized I/O processor takes over control of an I/O
operation to move a large block of data.

When the processor, main memory, and I/O share a common bus, two modes of addressing are possible:
memory mapped and isolated.

Memory mapped I/O

Uses instructions that transfer data between microprocessor & memory

Treated as memory location in the memory map

Portion of memory is used as I/O map


Isolated I /O

I/O locations are isolated from memory system in a separate address space.

User can expand the memory to its full size

Data transferred between I/O and microprocessor must be accessed by IN/OUT instructions

In PC, isolated I/O ports are used for controlling peripheral device

