Computer Instruction Set: - An Instruction Has Two Components

Computer Instruction Set
An instruction has two components:

Op-code, e.g. ADD
Operand(s), e.g. R0 100
The operand field may have the following formats:

1) zero-address
2) one-address
3) two-address
4) three-address
The total number of instructions and the types and formats of the operands determine the length of an instruction.
The shorter the instruction, the faster it can be fetched and decoded.
Shorter instructions are better than longer ones:

(i) they take up less space in memory
(ii) they are transferred to the CPU faster
A machine with 2^N instructions requires at least N bits to encode all the op-codes.
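As a quick check of the rule above, the minimum op-code width for an arbitrary instruction count can be computed as below. This is only a sketch; the function name `opcode_bits` is made up for illustration.

```python
def opcode_bits(num_instructions: int) -> int:
    """Smallest op-code width (in bits) that gives every instruction a distinct code."""
    if num_instructions < 1:
        raise ValueError("need at least one instruction")
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 1, without floating point.
    return (num_instructions - 1).bit_length()


for n in (8, 9, 16, 200):
    print(n, "instructions ->", opcode_bits(n), "bits")
```

Note that 9 instructions already need 4 bits, since 3 bits give only 8 distinct patterns.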
B. Ross COSC 3p92
Instruction sets
Byte ordering
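- Big-endian: bytes in a word are ordered left to right (most significant byte first), e.g. Motorola
- Little-endian: bytes in a word are ordered right to left (least significant byte first), e.g. Intel
Creates havoc when transferring data between machines of different byte order; need to swap
byte order in transferred words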
2.10, 2.11
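The byte-order problem can be seen with a small sketch that uses Python's built-in `int.to_bytes` and `int.from_bytes`. The word value and the helper name `swap32` are chosen only for illustration.

```python
value = 0x12345678  # a 32-bit word

big = value.to_bytes(4, byteorder="big")        # most significant byte first
little = value.to_bytes(4, byteorder="little")  # least significant byte first

print(big.hex())     # 12345678
print(little.hex())  # 78563412


def swap32(word: int) -> int:
    """Reverse the byte order of a 32-bit word."""
    return int.from_bytes(word.to_bytes(4, "big"), "little")


# A little-endian machine that reads bytes sent by a big-endian machine
# sees the wrong value until the bytes are swapped.
misread = int.from_bytes(big, byteorder="little")
assert swap32(misread) == value
```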
Op-code Encoding
1. Block-code technique
To each of the 2^K instructions a unique binary bit pattern of length K is assigned.
A K-to-2^K decoder can then be used to decode all the instructions. For example:

3-bit Op-code --> 3-to-8 decoder --> instruction 0, instruction 1, ..., instruction 7
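The behaviour of such a decoder can be modelled in a few lines. This is a software sketch of the idea, not a hardware description, and the function name `decode` is an assumption.

```python
def decode(opcode: int, k: int = 3) -> list[int]:
    """Model a K-to-2^K decoder: exactly one of the 2^K output lines is 1."""
    if not 0 <= opcode < 2 ** k:
        raise ValueError("op-code does not fit in k bits")
    return [1 if line == opcode else 0 for line in range(2 ** k)]


print(decode(0b101))  # [0, 0, 0, 0, 0, 1, 0, 0] : the line for instruction 5 is active
```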
Op-code Encoding
2. Expanding op-code technique
Consider a 16-bit (4+12) instruction with a 4-bit op-code and three 4-bit addresses.

[ Op-code | Address 1 | Address 2 | Address 3 ]

It can encode at most 16 three-address instructions.
If there are only 15 such three-address instructions, then the unused op-code (here 1111) can be used to expand to two-address, one-address or zero-address instructions.

[ 1111 | Op-code | Address 1 | Address 2 ]

Again, this expanded op-code can encode at most 16 two-address instructions. And if there are fewer than 16 such instructions, we can expand the op-code further.

[ 1111 | 1111 | Op-code | Address 1 ]
[ 1111 | 1111 | 1111 | Op-code ]
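A decoder for this scheme might look like the sketch below. It assumes that 1111 is the escape pattern at every level (so 15 three-address, 15 two-address, 15 one-address and 16 zero-address instructions); the function name and the returned tuple layout are illustrative only.

```python
def decode_expanding(word: int):
    """Split a 16-bit instruction into (format, op-code, address fields)."""
    fields = [(word >> shift) & 0xF for shift in (12, 8, 4, 0)]  # four 4-bit fields
    escapes = 0
    # Each leading 1111 field is an escape that moves the op-code one field right.
    while escapes < 3 and fields[escapes] == 0xF:
        escapes += 1
    names = ["three-address", "two-address", "one-address", "zero-address"]
    return names[escapes], fields[escapes], fields[escapes + 1:]


print(decode_expanding(0x2ABC))  # ('three-address', 2, [10, 11, 12])
print(decode_expanding(0xF3AB))  # ('two-address', 3, [10, 11])
print(decode_expanding(0xFFF5))  # ('zero-address', 5, [])
```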
Opcode Encoding
Note that the three address fields need not
be used to encode three separate address
operands; they can be used as a single 12-bit one-address operand.
Part of the op-code can specify the
instruction format and/or length.
- if there are few two-address instructions, we
may attempt to make them shorter instead
and use the first two bits to indicate the
instruction length, e.g., 10 means two-address
and 11 means three-address.
Op-code Encoding
Huffman encoding
Given the probability of occurrence of each instruction, it is possible to encode all the
instructions with a minimal number of bits, and
with the following property:
Fewer bits are used for the most frequently
used instructions and more for the least
frequently used ones.
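Example (Huffman tree; each branch is labelled 0 or 1, and internal nodes carry the summed probabilities 1/8, 1/4, 1/2, up to 1 at the root):

Instruction  Probability  Code
HALT         1/16         0000
JUMP         1/16         0001
SHIFT        1/16         0010
NOT          1/16         0011
AND          1/8          010
ADD          1/8          011
STO          1/4          10
LOAD         1/4          11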
Opcode encoding, Huffman codes
Huffman encoding algorithm:

1. Initialize the leaf nodes each with a
probability of an instruction. All nodes are
unmarked.
2. Find the two unmarked nodes with the
smallest values and mark them. Add a new
unmarked node with a value equal to the sum of
the chosen two.
3. Repeat step (2) until all nodes have been
marked except the last one, which has a value of 1.
4. The encoding for each instruction is found
by tracing the path from the unmarked node (the
root) to that instruction.
- may mark branches arbitrarily with 0, 1
Advantage: minimal number of bits
Disadvantage: must decode instructions bit-by-bit (can be slow).
- to decode, must have a logical representation of the encoded tree, and follow branches as you decipher bits
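The algorithm above can be sketched with a priority queue standing in for the "find the two smallest unmarked nodes" step. Popping a node from the heap plays the role of marking it. The function name `huffman_codes` and the 0/1 branch labelling are arbitrary choices; the probabilities are those of the eight-instruction example.

```python
import heapq
from itertools import count


def huffman_codes(probabilities):
    """Return {symbol: bit string} built by repeatedly merging the two smallest nodes."""
    tiebreak = count()  # keeps heap entries comparable when probabilities are equal
    # Heap entry: (probability, tiebreak, tree); a tree is a symbol or a (left, right) pair.
    heap = [(p, next(tiebreak), sym) for sym, p in probabilities.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p0, _, t0 = heapq.heappop(heap)  # the two smallest nodes are "marked" ...
        p1, _, t1 = heapq.heappop(heap)
        heapq.heappush(heap, (p0 + p1, next(tiebreak), (t0, t1)))  # ... and replaced by their sum
    codes = {}

    def walk(tree, prefix):
        # Trace the path from the root; left branch is 0, right branch is 1.
        if isinstance(tree, tuple):
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
        else:
            codes[tree] = prefix or "0"

    walk(heap[0][2], "")
    return codes


probs = {"HALT": 1/16, "JUMP": 1/16, "SHIFT": 1/16, "NOT": 1/16,
         "AND": 1/8, "ADD": 1/8, "STO": 1/4, "LOAD": 1/4}
for name, code in sorted(huffman_codes(probs).items(), key=lambda kv: len(kv[1])):
    print(name, code)
```

The exact bit patterns depend on how branches are labelled, so they may differ from the table in the slides, but the code lengths (2, 3 and 4 bits) are the same.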
Addressing modes
inherent - an op-code indicates the address of its operand
    CLI               ; clear the interrupt flag
immediate - an instruction contains or immediately precedes its operand value
    ADD #250, R1      % R1 := R1 + 250;
absolute - an instruction contains the memory address of its operand
    ADD 250, R1       % R1 := R1 + *(250);
register - an instruction contains the register address of its operand
    ADD R2, R1        % R1 := R1 + R2;
Addressing Modes
register indirect - the register named in the instruction holds the address of its operand
    ADD @R2, @R1      % *R1 := *R1 + *R2;
auto-decrement or auto-increment - the contents of the register are automatically decremented or incremented before or after the execution of the instruction
    MOV (R2)+, R1     % R1 := *(R2); R2 := R2 + k;
    MOV -(R2), R1     % R2 := R2 - k; R1 := *(R2);
indexed - an offset is added to a register to give the address of the operand
    MOV 2(R2), R1     % R1 := R2[2];
base-register - a displacement is added to an implicit or explicit base register to give the address of the operand
relative - same as base-register mode except that the instruction pointer is used as the base register
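How a source operand is fetched under several of these modes can be sketched with a toy machine model. Everything here (the `regs` and `mem` dictionaries, their contents, the step size `K`, and the function `operand`) is hypothetical and only meant to mirror the comments next to the assembly examples.

```python
# Hypothetical machine state: a few registers and a small memory.
regs = {"R1": 10, "R2": 300}
mem = {250: 7, 300: 40, 301: 41, 302: 42}
K = 1  # operand size used by auto-increment/decrement (assumed)


def operand(mode, arg=None, reg=None):
    """Fetch a source operand under the given addressing mode."""
    if mode == "immediate":          # ADD #250, R1 : the value is in the instruction
        return arg
    if mode == "absolute":           # ADD 250, R1  : the instruction holds a memory address
        return mem[arg]
    if mode == "register":           # ADD R2, R1   : the operand is the register contents
        return regs[reg]
    if mode == "register_indirect":  # ADD @R2, R1  : the register holds the address
        return mem[regs[reg]]
    if mode == "auto_increment":     # MOV (R2)+, R1: use the address, then step forward
        value = mem[regs[reg]]
        regs[reg] += K
        return value
    if mode == "auto_decrement":     # MOV -(R2), R1: step back, then use the address
        regs[reg] -= K
        return mem[regs[reg]]
    if mode == "indexed":            # MOV 2(R2), R1: address = register + offset
        return mem[regs[reg] + arg]
    raise ValueError(f"unknown mode: {mode}")
```

For example, `operand("indexed", arg=2, reg="R2")` reads `mem[302]`, while `operand("auto_increment", reg="R2")` reads `mem[300]` and leaves R2 at 301.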
Addressing modes
Indirect addressing mode in general also applies to
absolute addresses, not just register addresses; the
absolute address is a pointer to the operand.
The offset added to an index register may be as
large as the entire address space. On the other
hand, the displacement added to a base register is
generally much smaller than the entire address
space.
The automatic modification (i.e., auto-increment or
auto-decrement) to an index register is called
autoindexing.
Relative addresses have the advantage that the
code is position-independent.
Instruction Types
Instructions of most modern computers may be classified into the following six groups:
Data transfer (40% of user program instructions)
MOV, LOAD
Arithmetic
ADD, SUB, DIV, MUL
Logical
AND, OR, NOT, SHIFT, ROTATE
System-control
Test-And-Set
I/O
Separate I/O space input/output
Instruction Types
Program-control - may be classified into the following four groups:
Unconditional branch
    BRB NEXT                 % branch to the label NEXT
Conditional branch
    SOBGTR R5, LOOP          % repeat until R5=0
    ADBLEQ R5, R6, LOOP      % repeat until R5>R6
Subroutine call
    CALL SUB                 % push PC; branch to SUB
    RET                      % pop PC
Interrupt-handling
    TRAP                     % generate an internal interrupt
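The "push PC; branch" and "pop PC" behaviour of CALL and RET can be sketched with a toy model. The `Machine` class, its fields, and the fixed instruction length are assumptions made for illustration, not a description of any particular CPU.

```python
class Machine:
    """Hypothetical machine with a program counter and a stack of return addresses."""

    def __init__(self):
        self.pc = 0
        self.stack = []

    def call(self, target: int, instruction_length: int = 1):
        # push PC (the address of the instruction after the CALL), then branch to target
        self.stack.append(self.pc + instruction_length)
        self.pc = target

    def ret(self):
        # pop PC: resume at the saved return address
        self.pc = self.stack.pop()


m = Machine()
m.pc = 100
m.call(500)   # the return address 101 is saved on the stack
m.ret()
print(m.pc)   # 101
```

Because return addresses go on a stack, nested calls unwind in the reverse order in which they were made.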
Instruction types
Typical branch instructions test the value of some
flags called conditions. Certain instructions cause
these flags to be set automatically.
The registers used in implementing a subroutine
call are called linkage registers, which typically
include the instruction pointer and stack pointer.
The parameters passed between the caller and the
called subroutine are to be established by
programming conventions. Very few computers
support parameter-passing mechanisms in the
hardware.
An external interrupt may be regarded as a
hardware-generated subroutine call, except that it
may happen asynchronously. When it occurs, the
current state of the computation must be saved,
either automatically by the hardware or by a
program (the interrupt-service routine).
Examples: Intel Pentium 4

backward compatible with 8088 (16 bit, 8 bit data bus),
8086 (16 bit), 80286 (16 bit, larger addr), 80386 (32
bit), ...
3 operating modes:
1. real mode - acts like 8088 (unsafe -- can crash)
2. virtual 8086 - protected
3. protected - acts like Pentium 4
4 privilege levels too (kernel, user, ...)
little endian words

registers: [5.3]
EAX, EBX, ECX, EDX - general purpose, but have
special uses (eg. EAX = arithmetic, ...)
ESI, EDI, EBP, ESP - addr registers
Example: UltraSPARC III

[5.4]
single linear 2^64 memory space
Registers:
32 64-bit general regs, 32 FP regs
global var regs: used by all procedures
register windows: param. passing done via
registers (more later on RISC vs CISC)
Example: 8051
one mode (unprotected)

64 KB address space for programs, 64 KB for data
prog in ROM, data in RAM
lots of memory configurations:
prog 4 KB ROM, data 128-byte RAM (on-chip)
64 KB external ROM, 64 KB ext RAM
others
8 1-byte general registers
R0-R7: 3-bits in instn reference
4 sets of such registers,
2-bits in PSW determine which one is current
permits rapid interrupt processing: register set switching
all registers are addressable in memory space
eg. R0 and addr 0 are same
above 4 reg banks are 16 bytes that are bit-addressable (bits
0 thru 127)
permits status processing w/o fetching entire bytes
Special registers
PSW (prog stat word): set by arith instns
carry, aux carry, reg set, overflow, parity
IE: interrupt enable/disable
eg. all interrupts, serial channels, 2 timers
IP: interrupt priority for each interrupt
low or high
TCON: timer control

TMOD: timer mode (8, 13, or 16 bit)
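The register-bank and bit-addressing layout described above can be written as two small address calculations. This sketch assumes the standard 8051 layout, which the slides imply but do not spell out: the bank-select bits are PSW bits 4 and 3, the four banks occupy internal RAM addresses 0 to 31, and the bit-addressable bytes are the 16 bytes directly above them. The function names are illustrative.

```python
def register_address(psw: int, n: int) -> int:
    """Internal RAM address of Rn for the bank selected by PSW bits 4 and 3."""
    if not 0 <= n <= 7:
        raise ValueError("register number must be 0..7")
    bank = (psw >> 3) & 0b11   # 2 bits in the PSW pick one of 4 banks
    return bank * 8 + n        # each bank holds 8 one-byte registers


def bit_location(bit_address: int) -> tuple[int, int]:
    """Map bit address 0..127 to (byte address, bit index) in the 16 bytes above the banks."""
    if not 0 <= bit_address < 128:
        raise ValueError("bit address out of range")
    return 0x20 + bit_address // 8, bit_address % 8


print(register_address(0x00, 0))  # 0 : R0 of bank 0 is the same cell as address 0
print(register_address(0x18, 7))  # 31: R7 of bank 3
print(bit_location(127))          # (47, 7)
```

Switching banks on an interrupt only changes two PSW bits, which is why no registers have to be copied to memory.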
Pentium: Instruction formats

[5.13]
formats are complex, irregular, with variable-sized
fields (due to historical evolution)
no memory-to-memory instructions
8088/286 - 1 byte opcode
386 - expanding 1111 -> 2 byte opcode
Some fields:
2-bit MOD - 4 modes,8 regs, 8 combination regs
3 bit register REG, R/M
SIB (scale, index, base) array manipulation
codes
1,2,4 more bytes for operands, constants
Not all registers, modes applicable to all
instructions: highly non-orthogonal
Example: PDP-11 instruction formats
16 bit instn size

possibly 1 or 2 16 bit address words follow
8 modes, 8 regs -- regs 6 & 7 are stack, PC
"orthogonal" addressing -- addressing and opcodes
are independent
some instns use expanding opcode
x111 -> use longer opcode
Example: UltraSPARC inst. formats
[5.14]
32-bit instructions; 31 RISC instructions
first 2 bits help decode instruction format
to encode a 32 bit constant, need to do it in 2
separate instructions!
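Why two instructions are needed can be sketched as follows: a 32-bit instruction has no room for a 32-bit constant next to its op-code, so the constant is built in two pieces. The sketch assumes the 22-bit/10-bit split used by SPARC's SETHI followed by an OR with an immediate; the function names are illustrative.

```python
def split_constant(value: int) -> tuple[int, int]:
    """Split a 32-bit constant into a 22-bit high part and a 10-bit low part."""
    value &= 0xFFFFFFFF
    return value >> 10, value & 0x3FF


def load_constant(value: int) -> int:
    """Rebuild the constant the way the two-instruction sequence would."""
    hi22, lo10 = split_constant(value)
    reg = hi22 << 10   # first instruction: set the upper 22 bits, lower 10 bits are zero
    reg |= lo10        # second instruction: OR in the low 10 bits as an immediate
    return reg


print(hex(load_constant(0xDEADBEEF)))  # 0xdeadbeef
```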
Example: 8051
[5.16]
6 simple formats; 1, 2 or 3 bytes
format 4: 11 bits when no external memory; else
format 5
Example: Pentium addressing
8088/286 are very non-orthogonal, and addressing possibilities are arbitrary for different registers
old 5.35
386 -- if 16-bit segments used, then use above

- if 32-bit segments, use following...
5.36
Addressing: Pentium
new modes are more regular, general
SIB mechanism: [5.27] --> arrays
scale = 1, 2, 4, 8
multiply the Index register by the scale,
add the Base register,
and then add an 8- or 32-bit displacement
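The SIB address calculation can be written out directly. This is a sketch of the arithmetic only, with illustrative parameter names; it does not model instruction encoding.

```python
def effective_address(base: int, index: int, scale: int, displacement: int = 0) -> int:
    """Base + Index * scale + displacement, wrapped to 32 bits."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4 or 8")
    return (base + index * scale + displacement) & 0xFFFFFFFF


# Address of element i of an array of 4-byte integers starting at array_base.
array_base, i = 0x1000, 5
print(hex(effective_address(array_base, i, 4)))  # 0x1014
```

With the scale set to the element size, the Index register can hold the array subscript itself, which is why the slides associate SIB with array access.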
Examples of addressing
PDP-11
5.33
power comes from ability of addressing modes to

treat stack ptr, PC like any other registers
eg. mode 6 with PC (reg 7)
Addressing & PDP 11

orthogonality permits many variations with one opcode
5.34
Example: UltraSPARC addressing

all instructions use immediate or register
addressing, except those that address memory
only 3 instructions address memory: Load, Store,
and a multiproc. synch
use indirect addressing
register: 5 bits tell which register
13 bit constants for immediate
Example: 8051
5 modes [fig 5-29]
some instns use accumulator implicitly (no code
telling such... means instns are more compact!)
some modes (reg indirect) require operand to be in
bottom 256 bytes of memory, because that's where
registers are residing
64 KB of memory addressed by loading 2-byte offsets
Addressing: Discussion
PDP-11 is clean, simple; some waste
Pentium: specialized formats, addressing schemes
386 - 32 bit addressing is more general
RISC (Ultra): simpler instructions, fewer modes
Compilers will generate required addressing, so a simple
scheme will suffice
Specialized modes, formats make instruction parallelism
(pipelining) more difficult
fewer modes preferable over specialized modes
simplicity means better compilers
Compact instructions:
+: smaller resource usage
   faster fetch, execution
-: reduced robustness
Larger instructions:
+: simpler formats
   less constrained
-: performance waste