DMIPS
Dhrystone is a synthetic benchmark program for system programming. DMIPS
therefore does not count instructions per second; it indicates how long
a processor takes overall to execute the benchmark program.
The industry has adopted the VAX 11/780 as the reference 1 MIPS
machine. The VAX 11/780 achieves 1757 Dhrystones per second, so
DMIPS = (Dhrystones per second) / 1757.
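As a rough illustration of the normalization against the VAX 11/780, the following Python sketch converts a Dhrystone score into DMIPS. The function name `dmips` and the sample score are hypothetical; only the 1757 reference figure comes from the slide.

```python
VAX_DHRYSTONES_PER_SECOND = 1757  # VAX 11/780 reference score (the 1 MIPS machine)


def dmips(dhrystones_per_second):
    """Normalize a Dhrystone score against the VAX 11/780 reference."""
    return dhrystones_per_second / VAX_DHRYSTONES_PER_SECOND


if __name__ == "__main__":
    # Hypothetical score, chosen only to make the arithmetic easy to follow.
    print(dmips(175_700))  # 100.0
```

A processor that scores the same as the VAX 11/780 rates exactly 1 DMIPS.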
Two source registers (Rn and Rm) and one result register (Rd)
Sign extend => converts a signed 8/16-bit value to a signed 32-bit value
Barrel shifter => preprocesses Rm before it enters the ALU; it performs shift and rotate operations
MAC unit => for multiply and accumulate operations
ARM Architecture
ARM Core under study is ARM7TDMI (32-bit RISC CPU, 3-stage pipeline)
ARM state => Instructions are 32 bits wide and addresses are word aligned
Thumb state => Instructions are 16 bits wide and addresses are half-word aligned
ARM Modes:
Different modes of the ARM processor are defined for specific purposes
User mode => most application software runs in this mode
Non-exception modes => User, System
Exception modes => Supervisor, IRQ, FIQ, Abort, Undefined
Banked Registers:
ARM Family and Cores
(Table columns: ARM family | Core | Features | ARM ISA version | Thumb version)
ARM - The Barrel Shifter
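LSL: Logical Shift Left => CF <- Destination <- 0
ASR: Arithmetic Shift Right => Destination -> CF (the sign bit is replicated on the left)
(Third shift diagram: Destination -> CF; its label is not recoverable)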
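The LSL and ASR shifts named above can be modelled in a few lines of Python. This is only a sketch of the data movement on 32-bit values for shift amounts of 1 to 31; the function names and the (result, carry) return convention are assumptions made for illustration.

```python
MASK32 = 0xFFFFFFFF


def lsl(value, amount):
    """Logical shift left by 1..31 bits; returns (result, carry_out)."""
    assert 1 <= amount <= 31
    value &= MASK32
    carry = (value >> (32 - amount)) & 1  # last bit shifted out of bit 31
    return (value << amount) & MASK32, carry


def asr(value, amount):
    """Arithmetic shift right by 1..31 bits; returns (result, carry_out)."""
    assert 1 <= amount <= 31
    value &= MASK32
    carry = (value >> (amount - 1)) & 1  # last bit shifted out of bit 0
    # Reinterpret as a signed number so Python's >> replicates the sign bit.
    signed = value - (1 << 32) if value & 0x80000000 else value
    return (signed >> amount) & MASK32, carry


if __name__ == "__main__":
    print([hex(x) for x in lsl(0x80000001, 1)])
    print([hex(x) for x in asr(0x80000000, 4)])
```

LSL fills the vacated low bits with zeros, while ASR fills the vacated high bits with copies of the original sign bit.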
ARM Data Processing Instructions
CMP, CMN, TST and TEQ always update the flags (even if ‘S’ is not used as
a suffix) and do not alter any register. They use only Rn and op2.
MOV and MVN use only two operands, i.e. Rd and op2.
ARM Immediate Operand
Immediate Operand (32-bit):
ARM cannot generate every 32-bit constant (32-bit immediate data)
The instruction code contains only 12 bits to specify a 32-bit constant
Valid 32-bit constants are obtained from an 8-bit constant rotated right by an even number of
positions, i.e. 0, 2, 4, ..., 30
32-bit constants from a given 8-bit value and 4-bit rotate code:
if Imm=0x40, Rotate=0xD => rotate right by 26 => 32-bit constant = #4096 (0x00001000)
if Imm=0xFF, Rotate=0x8 => rotate right by 16 => 32-bit constant = #0x00FF0000
The amount of rotation is twice the value of the 4-bit "rotate" field
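The encoding rule can be checked with a short Python sketch. The helper names `ror32`, `decode_immediate` and `encode_immediate` are hypothetical; the rule they implement (8-bit value rotated right by twice the 4-bit rotate field) is the one stated above.

```python
MASK32 = 0xFFFFFFFF


def ror32(value, amount):
    """Rotate a 32-bit value right by `amount` bits."""
    amount %= 32
    value &= MASK32
    return ((value >> amount) | (value << (32 - amount))) & MASK32


def decode_immediate(imm8, rotate4):
    """Expand the 12-bit operand field into its 32-bit constant."""
    return ror32(imm8 & 0xFF, 2 * (rotate4 & 0xF))


def encode_immediate(constant):
    """Return one valid (imm8, rotate4) pair, or None if none exists."""
    for rotate4 in range(16):
        # Undo the rotate-right by rotating left by the same amount.
        imm8 = ror32(constant, 32 - 2 * rotate4)
        if imm8 <= 0xFF:
            return imm8, rotate4
    return None


if __name__ == "__main__":
    print(decode_immediate(0x40, 0xD))       # 4096
    print(hex(decode_immediate(0xFF, 0x8)))  # 0xff0000
    print(encode_immediate(0x101))           # None
```

A constant such as 0x101 needs nine significant bits, so no rotation of an 8-bit value can produce it and the search returns None. Some constants have several valid encodings; the sketch returns the first one it finds.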
Range of 32-bit constants for even
rotations, e.g. #0, #2 and #30
Conditional Execution:
ARM instructions can be made to execute conditionally by suffixing
them with the appropriate condition code (e.g. MOVEQ R0, R1)
The condition checks the status of the appropriate flags
If the condition is true, the instruction executes normally; otherwise it is not executed.
Adv. => Fewer branches, hence greater pipeline performance and higher code density, leading to
higher instruction throughput
ARM Conditional Execution
Set the flags, and then use various condition codes (here r0 = a, r1 = x)
CMP r0, #0        ; if (a == 0) x = 0;
MOVEQ r1, #0      ; if (a > 0) x = 1;
MOVGT r1, #1
Sequence of conditional compare instructions
CMP r0, #4        ; if (a == 4 || a == 10)
CMPNE r0, #10     ;     x = 0;
MOVEQ r1, #0
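The two sequences above can be traced with a small Python model of the flags. This is a sketch, not a processor simulator: `cmp_flags`, `CONDITIONS`, `first_example` and `second_example` are hypothetical names, and only the EQ, NE and GT conditions used on the slide are modelled.

```python
MASK32 = 0xFFFFFFFF


def cmp_flags(rn, op2):
    """Flags produced by CMP rn, op2, i.e. by the subtraction rn - op2."""
    rn &= MASK32
    op2 &= MASK32
    result = (rn - op2) & MASK32
    return {
        "N": result >> 31,                            # result is negative
        "Z": int(result == 0),                        # result is zero
        "C": int(rn >= op2),                          # no borrow occurred
        "V": ((rn ^ op2) & (rn ^ result)) >> 31,      # signed overflow
    }


CONDITIONS = {
    "EQ": lambda f: f["Z"] == 1,
    "NE": lambda f: f["Z"] == 0,
    "GT": lambda f: f["Z"] == 0 and f["N"] == f["V"],
}


def first_example(a, x):
    flags = cmp_flags(a, 0)           # CMP   r0, #0
    if CONDITIONS["EQ"](flags):       # MOVEQ r1, #0
        x = 0
    if CONDITIONS["GT"](flags):       # MOVGT r1, #1
        x = 1
    return x


def second_example(a, x):
    flags = cmp_flags(a, 4)           # CMP   r0, #4
    if CONDITIONS["NE"](flags):       # CMPNE r0, #10 (skipped when a == 4)
        flags = cmp_flags(a, 10)
    if CONDITIONS["EQ"](flags):       # MOVEQ r1, #0
        x = 0
    return x
```

In the second sequence the CMPNE only runs when the first comparison failed, so Z ends up set if either comparison matched. That is how the two compares implement the logical OR without a branch.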
ARM Branch Instructions
B <cc> label : branch to label
(MOV LR, PC can be used before the above instruction to store the return address.)
BL <cc> subroutine_label (LR automatically stores the return address.)
The 24-bit offset field of the instruction code is shifted left by 2 to get a 26-bit
signed effective offset (i.e. a total range of 2^26 bytes)
± 32 Mbyte range
How to perform longer branches? (use BX Rm)
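The offset arithmetic can be sketched in Python as follows. The function name `branch_target` is hypothetical. The sketch also adds 8 to the instruction address, because in ARM state the PC reads as the address of the current instruction plus 8; that detail is standard ARM behaviour but is not stated on the slide.

```python
def branch_target(instruction_address, offset24):
    """Target address of B/BL from the 24-bit offset field (ARM state)."""
    offset = (offset24 & 0xFFFFFF) << 2   # 24-bit field -> 26-bit byte offset
    if offset & (1 << 25):                # bit 25 is the sign bit
        offset -= 1 << 26                 # sign-extend to a negative offset
    return (instruction_address + 8 + offset) & 0xFFFFFFFF


if __name__ == "__main__":
    print(hex(branch_target(0x8000, 0x000000)))  # 0x8008
    print(hex(branch_target(0x8000, 0xFFFFFE)))  # 0x8000 (branch to itself)
```

The largest forward offset is 0x7FFFFF << 2, just under 32 Mbyte, and the largest backward offset is exactly 32 Mbyte, which gives the ± 32 Mbyte range.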
ARM Load & Store Instructions
Data movement between registers and memory
Instruction format : opcode<cc> <size> Rd, <address>
Opcodes:
LDR STR ;32-bit Word load & store
LDRB STRB ;Byte load & store
LDRH STRH ;16-bit Halfword load & store
LDRSB ;Signed byte load
LDRSH ;Signed halfword load
LDRB and LDRH copy 8-bit and 16-bit quantities from memory to the
destination register and force the higher bits of the destination register to
zero. For LDRSB and LDRSH the higher bits of the destination register
are filled with the sign bit
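The difference between the unsigned and signed loads comes down to zero extension versus sign extension. A minimal Python sketch, with hypothetical helper names, is:

```python
def zero_extend(value, bits):
    """What LDRB (bits=8) and LDRH (bits=16) place in the 32-bit register."""
    return value & ((1 << bits) - 1)


def sign_extend(value, bits):
    """What LDRSB (bits=8) and LDRSH (bits=16) place in the 32-bit register."""
    low_mask = (1 << bits) - 1
    value &= low_mask
    if value & (1 << (bits - 1)):          # sign bit of the loaded quantity
        value |= 0xFFFFFFFF & ~low_mask    # fill the higher bits with ones
    return value


if __name__ == "__main__":
    print(hex(zero_extend(0x80, 8)))   # 0x80
    print(hex(sign_extend(0x80, 8)))   # 0xffffff80
```

The byte 0x80 therefore reads as 128 through LDRB and as -128 (0xFFFFFF80) through LDRSB.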
Address:
Formed by a base register (Rn) and an offset
The base register can be any general-purpose register, including PC
The offset can be (for 32-bit word and unsigned byte)
a signed immediate (#12-bit value)
a register or
a scaled register (Rm with shift/rotate by #immediate only)
Offset for H, SH and SB :- immediate value (#8-bit) or register
Load & Store Instructions
Choice of indexing :- pre-index, pre-index with write back, and post-index addressing
Post-index and pre-index with write back modify the base register value.
Examples:-
LDR R8, [R3, #-3] ; Load R8 from address R3-3 (pre-index); R3 remains unchanged
LDR R3, [R9], #4 ; Load R3 from address R9, then R9=R9+4 (post-index)
STRB R7, [R6, #-1]! ; Store byte from R7 at address R6-1, then set R6=R6-1 (pre-index with write back)
LDREQB R0, [PC, -R2] ; Load byte into R0 from address PC-R2 if the EQ condition is true
LDR R11, [R3, R5, LSL #2] ; Load R11 from address R3 + R5*4
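The three indexing choices differ only in which address is accessed and what the base register holds afterwards. The Python sketch below models that; the function names and the (access_address, new_base) return convention are assumptions made for illustration.

```python
MASK32 = 0xFFFFFFFF


def pre_index(base, offset, write_back=False):
    """[Rn, #offset] or [Rn, #offset]!  ->  (access_address, new_base)."""
    address = (base + offset) & MASK32
    return address, (address if write_back else base)


def post_index(base, offset):
    """[Rn], #offset  ->  (access_address, new_base)."""
    return base, (base + offset) & MASK32


if __name__ == "__main__":
    print([hex(x) for x in pre_index(0x200, 12)])        # base unchanged
    print([hex(x) for x in pre_index(0x200, -1, True)])  # base updated first
    print([hex(x) for x in post_index(0x200, 4)])        # base updated after
```

Post-index always writes the new address back to the base register, which is why it has no separate "!" form.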
ARM Pre & Post indexing
Pre-indexed: STR r0, [r1, #12]
(Figure: base register r1 = 0x200, offset 12; the source register r0 = 0x5 is stored at address 0x20c, and r1 still holds 0x200 afterwards.)
ARM Load/Store Multiple
Multiple register load and store with single instruction
Syntax :
LDM <cc> <add_mode> Rn {!} , {registers}
STM <cc> <add_mode> Rn {!} , {registers}
where add_mode :- IA | IB | DA | DB (increment after, increment before, decrement after, decrement before)
Rn (base address) :- must not be PC, and must not appear in the register list if !
(write back) is specified
Block memory copy: R9 -> points to the start of the source, R4 -> total number of words to be
copied, R10 -> points to the start of the destination
We first transfer the data in blocks (say 8 words) using LDM/STM and the register set R0-R7
If the last block has fewer than 8 words, the remaining words can be transferred
using LDR and STR (one word at a time)
        MOV   R11, R4           ; copy the word count from R4 into R11
loop1:  CMP   R11, #8           ; at least 8 words left?
        BLO   skip              ; no: copy the remainder word by word
        LDMIA R9!, {R0-R7}      ; load eight 32-bit words and advance R9
        STMIA R10!, {R0-R7}     ; store them and advance R10
        SUBS  R11, R11, #8      ; eight fewer words remain
        B     loop1
skip:   CMP   R11, #0           ; is R11 zero?
        BEQ   halt              ; done if no words remain
loop2:  LDR   R0, [R9], #4      ; word-by-word transfer (post-index)
        STR   R0, [R10], #4
        SUBS  R11, R11, #1
        BNE   loop2
halt:   B     halt              ; copy finished
        END
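The control flow of the block copy can be followed in a short Python model. It is only a sketch: memory is assumed to be a Python list of words indexed by word number rather than by byte address, and the function name `block_copy` is hypothetical.

```python
def block_copy(memory, src, dst, word_count):
    """Copy word_count words inside `memory`; returns the number of 8-word bursts."""
    bursts = 0
    remaining = word_count                 # MOV   R11, R4
    while remaining >= 8:                  # CMP   R11, #8 / BLO skip
        regs = memory[src:src + 8]         # LDMIA R9!, {R0-R7}
        memory[dst:dst + 8] = regs         # STMIA R10!, {R0-R7}
        src += 8
        dst += 8
        remaining -= 8                     # SUBS  R11, R11, #8
        bursts += 1
    while remaining != 0:                  # CMP   R11, #0 / BEQ halt
        memory[dst] = memory[src]          # LDR / STR with post-index
        src += 1
        dst += 1
        remaining -= 1                     # SUBS  R11, R11, #1 / BNE loop2
    return bursts


if __name__ == "__main__":
    mem = list(range(1, 20)) + [0] * 19    # 19 source words, 19 empty words
    print(block_copy(mem, 0, 19, 19))      # 2 bursts, then 3 single words
    print(mem[19:] == list(range(1, 20)))  # True
```

With 19 words, two LDM/STM bursts move 16 words and the last 3 are copied one at a time.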
ARM Stack Operations
Stack operations:
SP replaces Rn; the add_mode values for stacks are FD | FA | ED | EA
F and E signify whether SP points to a location that is full or empty
The stack is either ascending (growing towards higher memory addresses) or
descending (growing towards lower memory addresses)
One of the following pairs is used in an interrupt routine or handler:
STMFD/LDMFD, STMFA/LDMFA, STMED/LDMED, STMEA/LDMEA
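As one concrete case, the full descending (FD) stack can be modelled in Python. This is a sketch under stated assumptions: memory is a list of words indexed by word number, and the class name `FullDescendingStack` is hypothetical. The push decrements SP before each store and the pop increments it after each load, which is how STMFD and LDMFD behave.

```python
class FullDescendingStack:
    """Word-indexed model of a full descending stack (STMFD / LDMFD)."""

    def __init__(self, memory_words, sp):
        self.memory = [0] * memory_words
        self.sp = sp                      # "full": SP points at the last item pushed

    def push(self, values):
        """STMFD SP!, {...}: the first listed value lands at the lowest address."""
        for value in reversed(values):
            self.sp -= 1                  # "descending": decrement before storing
            self.memory[self.sp] = value

    def pop(self, count):
        """LDMFD SP!, {...}: read upwards from SP, then move SP past the items."""
        values = self.memory[self.sp:self.sp + count]
        self.sp += count
        return values


if __name__ == "__main__":
    stack = FullDescendingStack(16, 16)   # empty stack: SP just past the top
    stack.push([1, 2, 3])
    print(stack.sp, stack.memory[13:16])  # 13 [1, 2, 3]
    print(stack.pop(3), stack.sp)         # [1, 2, 3] 16
```

Using the same suffix for the store and the load keeps the two operations symmetric, so a handler that pushes registers on entry restores exactly those registers on exit.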
ARM Miscellaneous Instr.
SWP <cc> Rd, Rm, [Rn]
Swap a word between memory and registers
tmp = mem32[Rn], mem32[Rn] = Rm and Rd = tmp
SWPB <cc> Rd, Rm, [Rn] => Swap a byte
The swap instruction is atomic: it reads and writes a memory location in a single
uninterruptible bus operation. Useful in implementing semaphores and mutual exclusion.
CPSR instructions:
MRS {<cc>} Rd, <CPSR | SPSR> ;copy from PSR to Rd
MSR {<cc>} <CPSR | SPSR>, Rm ; copy from Rm to PSR
Assembler Pseudo Instructions:
LDR Rd, =constant