
Computer Instruction Set

An instruction has two components: an op-code and its operand(s).

      Op-code    Operand(s)
e.g.  ADD        R0  100

The operand field may have the following formats:


1) zero-address
2) one-address
3) two-address
4) three-address

The total number of instructions and the types and formats
of the operands determine the length of an instruction.

The shorter the instruction, the faster it can be fetched
and decoded.

Shorter instructions are better than longer ones:
(i) they take up less space in memory
(ii) they are transferred to the CPU faster

A machine with 2^N instructions requires at least N bits to encode all the op-codes.
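
The bit count above can be checked with a small sketch. The function name `min_opcode_bits` is made up for illustration, and it assumes fixed-length op-codes with one distinct bit pattern per instruction.

```python
def min_opcode_bits(num_instructions: int) -> int:
    """Smallest fixed op-code width that gives every instruction a distinct pattern."""
    if num_instructions < 1:
        raise ValueError("need at least one instruction")
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 1, using integers only.
    return (num_instructions - 1).bit_length()


print(min_opcode_bits(8))   # 3
print(min_opcode_bits(9))   # 4
print(min_opcode_bits(16))  # 4
```

When the instruction count is not a power of two, the width rounds up, so some bit patterns are left unused.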

B. Ross COSC 3p92

Instruction sets
Byte ordering
- Big-endian: bytes in a word are ordered from left to right, e.g. Motorola
- Little-endian: bytes in a word are ordered from right to left, e.g. Intel
Creates havoc when transferring data; need to swap
byte order in transferred words

2.10, 2.11
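
A minimal sketch of the byte-order difference and the swap needed when transferring words. The value `0x12345678` and the helper name `swap32` are arbitrary choices for illustration; `struct.pack` and `int.to_bytes` are standard Python.

```python
import struct

value = 0x12345678  # arbitrary 32-bit word chosen for illustration

big = struct.pack(">I", value)     # big-endian: most significant byte first
little = struct.pack("<I", value)  # little-endian: least significant byte first


def swap32(word: int) -> int:
    """Reverse the byte order of a 32-bit word."""
    return int.from_bytes(word.to_bytes(4, "big"), "little")


print(big.hex())            # 12345678
print(little.hex())         # 78563412
print(hex(swap32(value)))   # 0x78563412
```

Swapping twice returns the original word, which is why the same routine can be applied on either side of a transfer.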


Op-code Encoding

1. Block-code technique

To each of the 2^K instructions a unique
binary bit pattern of length K is assigned.

A K-to-2^K decoder can then be used to
decode all the instructions. For example,

[Figure: a 3-bit op-code feeds a 3-to-8 decoder whose eight outputs select instruction 0 through instruction 7.]
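
As a rough software model of the decoder described above (the real thing is combinational hardware), the sketch below returns the 2^K output lines for a given op-code. The name `decode_block` is hypothetical.

```python
def decode_block(opcode: int, k: int = 3) -> list[int]:
    """Model a K-to-2^K decoder: exactly one of the 2^K output lines is 1."""
    if not 0 <= opcode < (1 << k):
        raise ValueError("op-code does not fit in k bits")
    return [1 if line == opcode else 0 for line in range(1 << k)]


print(decode_block(0b101))  # [0, 0, 0, 0, 0, 1, 0, 0]
```

Output line 5 is the only one asserted, which corresponds to selecting instruction 5.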


Op-code Encoding

2. Expanding op-code technique

Consider a 16-bit (4 + 12) instruction with a 4-bit op-code and three 4-bit addresses.

Op-code | Address 1 | Address 2 | Address 3

It can encode at most 16 three-address instructions.

If there are only 15 such three-address instructions,
then the unused op-code can be used to
expand to two-address, one-address or zero-address
instructions.

1111 | Op-code | Address 1 | Address 2

Again, this expanded op-code can encode at most
16 two-address instructions. And if there are fewer
than 16 such instructions, we can expand the op-code further.

1111 | 1111 | Op-code | Address 1

1111 | 1111 | 1111 | Op-code
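
The layouts above can be decoded by checking each 4-bit field for the escape pattern 1111. The sketch below assumes exactly that scheme (1111 is the only escape at each level, and the last level may use all 16 patterns); the function name `decode_expanding` and the format labels are made up for illustration.

```python
ESCAPE = 0xF  # the 1111 pattern reserved to mean "op-code continues in the next field"
FORMATS = ["three-address", "two-address", "one-address", "zero-address"]


def decode_expanding(word: int) -> tuple[str, int, list[int]]:
    """Split a 16-bit instruction into (format, op-code, address fields)."""
    if not 0 <= word <= 0xFFFF:
        raise ValueError("expected a 16-bit instruction word")
    # Four 4-bit fields, leftmost first.
    fields = [(word >> shift) & 0xF for shift in (12, 8, 4, 0)]
    for position, fmt in enumerate(FORMATS):
        # The first field that is not the escape holds the op-code;
        # in the last position every pattern is an op-code.
        if fields[position] != ESCAPE or position == 3:
            return fmt, fields[position], fields[position + 1:]
    raise AssertionError("unreachable")


print(decode_expanding(0x2345))  # ('three-address', 2, [3, 4, 5])
print(decode_expanding(0xF345))  # ('two-address', 3, [4, 5])
print(decode_expanding(0xFF12))  # ('one-address', 1, [2])
print(decode_expanding(0xFFF7))  # ('zero-address', 7, [])
```

Under this assumption the machine has room for 15 three-address, 15 two-address, 15 one-address and 16 zero-address instructions.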


Opcode Encoding
Note that the three address fields need not
be used to encode a three-address
operand; they can be used as a single 12-bit one-address operand.
Part of the op-code can specify the
instruction format and/or length.
- if there are few two-address instructions, we
may attempt to make them shorter instead
and use the first two bits to indicate the
instruction length, e.g., 10 means two-address
and 11 means three-address.


Op-code Encoding

Huffman encoding

Given the probability of occurrence of each
instruction, it is possible to encode all the
instructions with a minimal average number of bits, and
with the following property:
Fewer bits are used for the most frequently
used instructions and more for the least
frequently used ones.

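[Figure: Huffman tree with root value 1 and internal nodes of value 1/2, 1/4 and 1/8; each pair of branches is labelled 0 and 1.]

Instruction   Probability   Code
HALT          1/16          0000
JUMP          1/16          0001
SHIFT         1/16          0010
NOT           1/16          0011
AND           1/8           010
ADD           1/8           011
STO           1/4           10
LOAD          1/4           11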

Opcode encoding, Huffman codes

Huffman encoding algorithm:


1. Initialize the leaf nodes, each with the
probability of an instruction. All nodes are
unmarked.
2. Find the two unmarked nodes with the
smallest values and mark them. Add a new
unmarked node with a value equal to the sum of
the chosen two.
3. Repeat step (2) until all nodes have been
marked except the last one, which has a value of 1.
4. The encoding for each instruction is found
by tracing the path from the unmarked node (the
root) to that instruction.
- the two branches at each node may be labelled 0 and 1 arbitrarily

Advantage: minimal average number of bits

Disadvantage: must decode instructions bit-by-bit
(can be slow).
- to decode, must have a logical representation of
the encoded tree, and follow branches as you
decipher bits
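
The four steps above can be sketched with a priority queue, where popping a node plays the role of marking it. This is one possible implementation, not necessarily the one intended by the slides: the names `huffman_codes` and `decode` are made up, and the probabilities are the ones from the example tree (1/16 for HALT, JUMP, SHIFT and NOT; 1/8 for AND and ADD; 1/4 for STO and LOAD). Because branch labels are arbitrary, the bit patterns may differ from the slide, but the code lengths should match.

```python
import heapq
from fractions import Fraction
from itertools import count


def huffman_codes(probabilities: dict[str, Fraction]) -> dict[str, str]:
    """Build a prefix code by repeatedly merging the two smallest nodes."""
    tie = count()  # tie-breaker so the heap never has to compare tree payloads
    heap = [(p, next(tie), name) for name, p in probabilities.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p0, _, left = heapq.heappop(heap)    # "mark" the smallest node
        p1, _, right = heapq.heappop(heap)   # "mark" the next smallest
        heapq.heappush(heap, (p0 + p1, next(tie), (left, right)))
    codes: dict[str, str] = {}

    def walk(node, prefix: str) -> None:
        if isinstance(node, tuple):          # internal node: label branches 0 and 1
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:                                # leaf: an instruction name
            codes[node] = prefix or "0"

    walk(heap[0][2], "")
    return codes


def decode(bits: str, codes: dict[str, str]) -> list[str]:
    """Decode bit by bit; a dictionary of codes stands in for walking the tree."""
    by_code = {code: name for name, code in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in by_code:
            out.append(by_code[current])
            current = ""
    return out


probs = {
    "HALT": Fraction(1, 16), "JUMP": Fraction(1, 16),
    "SHIFT": Fraction(1, 16), "NOT": Fraction(1, 16),
    "AND": Fraction(1, 8), "ADD": Fraction(1, 8),
    "STO": Fraction(1, 4), "LOAD": Fraction(1, 4),
}
codes = huffman_codes(probs)
average = sum(p * len(codes[name]) for name, p in probs.items())
print({name: len(code) for name, code in codes.items()})
print(average)  # 11/4, i.e. 2.75 bits per op-code versus 3 bits for a block code
```

The decoder consumes one bit at a time, which illustrates the disadvantage noted above: the instruction length is not known until the last bit of the op-code has been examined.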


Addressing modes

inherent - an op-code indicates the address of its
operand
CLI              ; clear the interrupt flag

immediate - an instruction contains or
immediately precedes its operand value
ADD #250, R1     % R1 := R1 + 250;

absolute - an instruction contains the memory
address of its operand
ADD 250, R1      % R1 := R1 + *(250);

register - an instruction contains the register
address of its operand
ADD R2, R1       % R1 := R1 + R2;

Addressing Modes

register indirect - the register address in an
instruction specifies the address of its operand
ADD @R2, @R1     % *R1 := *R1 + *R2;

auto-decrement or auto-increment - the contents of the register are
automatically decremented or incremented
before or after the execution of the
instruction
MOV (R2)+, R1    % R1 := *(R2); R2 := R2 + k;
MOV -(R2), R1    % R2 := R2 - k; R1 := *(R2);

indexed - an offset is added to a register to give
the address of the operand
MOV 2(R2), R1    % R1 := R2[2];

base-register - a displacement is added to an
implicit or explicit base register to give the address
of the operand

relative - same as base-register mode except that
the instruction pointer is used as the base register


Addressing modes
- Indirect addressing mode in general also applies to
absolute addresses, not just register addresses; the
absolute address is a pointer to the operand.
- The offset added to an index register may be as
large as the entire address space. On the other
hand, the displacement added to a base register is
generally much smaller than the entire address
space.
- The automatic modification (i.e., auto-increment or
auto-decrement) to an index register is called
autoindexing.
- Relative addresses have the advantage that the
code is position-independent.
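
To make the modes concrete, here is a toy operand fetch over a dictionary-based register file and memory. Everything in it is an assumption for illustration: the function name `fetch_operand`, the mode names as strings, the memory contents, and an operand size k of 1.

```python
def fetch_operand(mode: str, arg, regs: dict, mem: dict, k: int = 1) -> int:
    """Return the operand value selected by a few of the addressing modes."""
    if mode == "immediate":            # operand is in the instruction itself
        return arg
    if mode == "absolute":             # instruction holds the memory address
        return mem[arg]
    if mode == "register":             # instruction names a register
        return regs[arg]
    if mode == "register_indirect":    # register holds the operand's address
        return mem[regs[arg]]
    if mode == "auto_increment":       # use the address, then step the register
        value = mem[regs[arg]]
        regs[arg] += k
        return value
    if mode == "auto_decrement":       # step the register, then use the address
        regs[arg] -= k
        return mem[regs[arg]]
    if mode == "indexed":              # arg is (offset, register)
        offset, reg = arg
        return mem[regs[reg] + offset]
    raise ValueError(f"unknown mode: {mode}")


regs = {"R1": 0, "R2": 4}
mem = {4: 40, 5: 50, 6: 60, 250: 7}

print(fetch_operand("immediate", 250, regs, mem))         # 250
print(fetch_operand("absolute", 250, regs, mem))          # 7
print(fetch_operand("register_indirect", "R2", regs, mem))  # 40
print(fetch_operand("indexed", (2, "R2"), regs, mem))     # 60
```

Base-register and relative modes would follow the same pattern as `indexed`, with the base register or the instruction pointer supplying the starting address.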


Instruction Types

The instructions of most modern computers may be
classified into the following six groups:

Data transfer (40% of user program instructions)
MOV, LOAD

Arithmetic
ADD, SUB, DIV, MUL

Logical
AND, OR, NOT, SHIFT, ROTATE

System-control
Test-And-Set

I/O
Separate I/O space input/output


Instruction Types

Program-control - may be classified into the
following four groups:

Unconditional branch
BRB NEXT                % branch to the label NEXT

Conditional branch
SOBGTR R5, LOOP         % repeat until R5=0
ADBLEQ R5, R6, LOOP     % repeat until R5>R6

Subroutine call
CALL SUB                % push PC; branch to SUB
RET                     % pop PC

Interrupt-handling
TRAP                    % generate an internal interrupt

Instruction types
- Typical branch instructions test the value of some
flags called conditions. Certain instructions cause
these flags to be set automatically.
- The registers used in implementing a subroutine
call are called linkage registers, which typically
include the instruction pointer and stack pointer.
- The parameters passed between the caller and the
called subroutine are to be established by
programming conventions. Very few computers
support parameter-passing mechanisms in the
hardware.
- An external interrupt may be regarded as a
hardware-generated subroutine call except that it
may happen asynchronously. When it occurs, the
current state of the computation must be saved
either automatically by the hardware or under the
control of a program (the interrupt-service routine).
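
The role of the linkage registers in a subroutine call (push PC, branch; pop PC on return) can be sketched as follows. The class `TinyCPU` and its fields are hypothetical, and the sketch assumes the PC already points at the instruction after the CALL when the call executes.

```python
class TinyCPU:
    """Toy model of the instruction pointer and stack used for CALL and RET."""

    def __init__(self) -> None:
        self.pc = 0       # instruction pointer
        self.stack = []   # Python list standing in for the stack and stack pointer

    def call(self, target: int) -> None:
        self.stack.append(self.pc)  # push PC (the return address)
        self.pc = target            # branch to the subroutine

    def ret(self) -> None:
        self.pc = self.stack.pop()  # pop PC


cpu = TinyCPU()
cpu.pc = 101          # assumed: PC already advanced past the CALL instruction
cpu.call(500)
print(cpu.pc, cpu.stack)  # 500 [101]
cpu.ret()
print(cpu.pc, cpu.stack)  # 101 []
```

An interrupt could be modelled the same way, with the hardware invoking `call` at an unpredictable point and additional state saved alongside the PC.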


Examples: Intel Pentium 4


backward compatible with the 8088 (16 bit, 8-bit data bus),
8086 (16 bit), 80286 (16 bit, larger address space), 80386 (32
bit), ...
3 operating modes:
1. real mode - acts like 8088 (unsafe -- can crash)
2. virtual 8086 - protected
3. protected - acts like Pentium 4
4 privilege levels too (kernel, user, ...)

little-endian words

registers: [5.3]
EAX, EBX, ECX, EDX - general purpose, but have
special uses (e.g. EAX = arithmetic, ...)
ESI, EDI, EBP, ESP - address registers


Example: UltraSPARC III


[5.4]
single linear 2^64 memory space
Registers:
32 64-bit general regs, 32 FP regs
global var regs: used by all procedures
register windows: param. passing done via
registers (more later on RISC vs CISC)


Example: 8051

one mode (unprotected)
64 KB address space for programs, 64 KB for data
- program in ROM, data in RAM
lots of memory configurations:
- program in 4 KB ROM, data in 128-byte RAM (on-chip)
- 64 KB external ROM, 64 KB external RAM
- others
8 1-byte general registers
- R0-R7: 3 bits in the instruction reference a register
- 4 sets of such registers;
2 bits in the PSW determine which one is current
- permits rapid interrupt processing: register set switching
all registers are addressable in memory space
- e.g. R0 and address 0 are the same
above the 4 register banks are 16 bytes that are bit-addressable (bits
0 thru 127)
- permits status processing w/o fetching entire bytes
Special registers
- PSW (program status word): set by arithmetic instructions;
carry, aux carry, register set, overflow, parity
- IE: interrupt enable/disable,
e.g. all interrupts, serial channels, 2 timers
- IP: interrupt priority for each interrupt (low or high)
- TCON: timer control
- TMOD: timer mode (8, 13, or 16 bit)
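
A small sketch of how the register banks and the bit-addressable bytes map onto internal RAM. Beyond what the slide states, it assumes the usual 8051 layout: the bank-select bits are PSW bits 4 and 3, bank b occupies addresses 8b to 8b+7, and the 16 bit-addressable bytes sit at 0x20 to 0x2F. The function names are made up.

```python
def register_address(psw: int, n: int) -> int:
    """Internal RAM address of Rn for the bank selected by the PSW."""
    if not 0 <= n <= 7:
        raise ValueError("register number must be 0..7")
    bank = (psw >> 3) & 0b11   # assumed: bank-select bits are PSW bits 4 and 3
    return bank * 8 + n


def bit_location(bit: int) -> tuple[int, int]:
    """Map bit address 0..127 to (byte address, bit position within the byte)."""
    if not 0 <= bit <= 127:
        raise ValueError("bit address must be 0..127")
    return 0x20 + bit // 8, bit % 8


print(register_address(0x00, 0))  # 0  (bank 0: R0 is address 0)
print(register_address(0x18, 7))  # 31 (bank 3: R7 is the last register byte)
print(bit_location(127))          # (47, 7), i.e. byte 0x2F, bit 7
```

Switching banks only changes two PSW bits, which is why an interrupt handler can get a fresh register set without saving R0-R7 to memory.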


Pentium: Instruction formats


[5.13]
formats are complex, irregular, with variable-sized
fields (due to historical evolution)
no memory-to-memory instructions
8088/286 - 1 byte opcode
386 - expanding 1111 -> 2 byte opcode
Some fields:
2-bit MOD - 4 modes, 8 regs, 8 combination regs
3-bit register fields REG, R/M
SIB (scale, index, base) - array manipulation
codes
1, 2, or 4 more bytes for operands, constants
Not all registers, modes applicable to all
instructions: highly non-orthogonal


Example: PDP-11 instruction formats

16 bit instn size


possibly 1 or 2 16 bit address words follow
8 modes, 8 regs -- regs 6 & 7 are stack, PC
"orthogonal" addressing -- addressing and opcodes
are independent
some instns use expanding opcode
x111 -> use longer opcode


Example: UltraSPARC inst. formats

[5.14]
32-bit instructions; 31 RISC instructions
first 2 bits help decode instruction format
to encode a 32 bit constant, need to do it in 2
separate instructions!
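
One way to picture the two-instruction constant load: the constant is split into an upper and a lower part, and each part travels in its own instruction. The 22-bit/10-bit split below follows the SPARC SETHI-then-OR idiom, which is an assumption beyond what the slide states; the function names are hypothetical.

```python
def split_constant(value: int) -> tuple[int, int]:
    """Split a 32-bit constant into an upper 22-bit part and a lower 10-bit part."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("expected a 32-bit constant")
    return (value >> 10) & 0x3FFFFF, value & 0x3FF


def rebuild(hi22: int, lo10: int) -> int:
    """Model the two instructions: set the upper bits, then OR in the lower bits."""
    reg = hi22 << 10   # first instruction (assumed SETHI): low 10 bits are zero
    reg |= lo10        # second instruction (assumed OR with a small immediate)
    return reg


hi22, lo10 = split_constant(0xDEADBEEF)
print(hex(rebuild(hi22, lo10)))  # 0xdeadbeef
```

Neither part would fit alongside an op-code and register fields if the full 32 bits had to go into a single 32-bit instruction.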

Example: 8051
[5.16]
6 simple formats; 1, 2 or 3 bytes
format 4: 11 bits when no external memory; else
format 5


Example: Pentium addressing

8088/286 are very non-orthogonal, and addressing
possibilities are arbitrary for different registers

old 5.35

386 -- if 16-bit segments used, then use above


- if 32-bit segments, use following...

5.36


Addressing: Pentium
new modes are more regular, general
SIB mechanism: [5.27] --> arrays
scale = 1, 2, 4, 8
the Index register is multiplied by the scale,
added to the Base register,
and then an 8- or 32-bit displacement is added
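
The SIB calculation amounts to base + index * scale + displacement. The sketch below is a plain arithmetic model with a hypothetical function name; it assumes 32-bit wraparound and ignores segmentation.

```python
def sib_effective_address(base: int, index: int, scale: int, displacement: int = 0) -> int:
    """Effective address = base + index * scale + displacement (mod 2^32)."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4 or 8")
    return (base + index * scale + displacement) & 0xFFFFFFFF


# Element 3 of an array of 4-byte items that starts 8 bytes past the base address.
print(hex(sib_effective_address(0x1000, 3, 4, 8)))  # 0x1014
```

The scale values match common element sizes, so an array element can be addressed in one instruction without a separate multiply.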


Examples of addressing
PDP-11

5.33

power comes from ability of addressing modes to


treat stack ptr, PC like any other registers
eg. mode 6 with PC (reg 7)


Addressing & PDP 11


orthogonality permits many variations with one opcode

5.34


Example: UltraSPARC addressing


all instructions use immediate or register
addressing, except those that address memory
only 3 instructions address memory: Load, Store,
and a multiproc. synch
use indirect addressing
register: 5 bits tell which register
13 bit constants for immediate

Example: 8051
5 modes [fig 5-29]
some instns use the accumulator implicitly (no code
telling such... means instns are more compact!)
some modes (reg indirect) require the operand to be in
the bottom 256 bytes of memory, because that's where
the registers reside
64 KB of memory addressed by loading 2-byte (16-bit) offsets


Addressing: Discussion
PDP-11 is clean, simple; some waste
Pentium: specialized formats, addressing schemes
386 - 32-bit addressing is more general
RISC (Ultra): simpler instructions, fewer modes
Compilers will generate required addressing, so a simple
scheme will suffice
Specialized modes and formats make instruction parallelism
(pipelining) more difficult
fewer modes are preferable to specialized modes
simplicity means better compilers

Compact instructions:
+: smaller resource usage
   faster fetch, execution
-: reduced robustness
Larger instructions:
+: simpler formats
   less constrained
-: performance waste
