A property of MVG_OMALLOOR

Addressing Modes

Addressing
- Addressing refers to the means of specifying the location of operands for instructions
  - the types of addressing are called addressing modes
  - operands may be input operands for the operation as well as results of the operation
- DSPs contain separate address generation units (AGUs)
  - arithmetic units dedicated to address calculation
  - Analog Devices calls this unit a data address generator; Lucent Technologies calls it an address arithmetic unit

Implied Addressing
- Operand addresses are implied by the instruction
  - In the AT&T DSP16, the operands for the multiplier are always fetched from registers X and Y and the result is placed into register P. While the assembler syntax of multiplication is
      P = X * Y
    it could also be written simply as
      *

Immediate Addressing
- The operand itself is encoded into the instruction
  - In the ADSP-21xx, a constant value is loaded into the register AX0 as follows:
      AX0 = 1234
- Small data words may be included in the instruction word
  - Typically half of the instruction word width
- Longer data words may be stored in the word following the instruction
  - Requires two accesses to program memory (instruction and operand)

Memory-Direct Addressing
- Also called absolute addressing
- The address of the operand is encoded into the instruction or can be found in a separate data word following the instruction
  - In the ADSP-21xx, an operand at address 1000 is loaded into register AX0 as follows:
      AX0 = DM(1000)
- Small addresses can be encoded into the instruction word
- Long addresses require a separate data word following the instruction code

Register-Direct Addressing
- The operand can be found in a specified register
  - In the TMS320C3X, the value in register R1 is subtracted from the value in register R2 and the result is stored in R2 as follows:
      SUBF R1,R2
- Important addressing mode in load-store processors

Register-Indirect Addressing
- The operand is located at the memory address stored in a register
- A special group of registers can be used to store addresses (address registers)
- Most important addressing mode in DSPs
  - Natural pointing mechanism when working with data arrays
    - Allows automatic modification of pointers
  - Efficient from the instruction set point of view
    - Few bits are needed to indicate the address of the operand
  - In the Lucent DSP32C, the value pointed to by the contents of register R5 is added to the value in accumulator A0:
      A0 = A0 + *R5

Register-Indirect Addressing with Pre- or Post-Increment
- Many DSP algorithms access data arrays sequentially
  - the address generation unit can increment the address value in the address register
    - before the memory access (pre-increment)
    - after the memory access (post-increment)
  - DSP32xx:
      A0 = A0 + *R5++; post-increment by one
      A0 = A0 + *R5--; post-decrement by one
- Some DSPs support address increment or decrement by a value in another register (offset register, modifier register)
  - In the DSP32xx, address register R5 is post-incremented by the value stored in register R17 as follows:
      A0 = A0 + *R5++R17;
  - In the DSP5600x:
      MOVE X:-(R0), A1; pre-decrement R0
- A pre-increment operation typically requires an extra instruction cycle
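
The post-increment behaviour described above can be sketched in a few lines of Python. This is only an illustrative model, not vendor code: the function name `sum_with_post_increment`, the list used as data memory, and the `step` parameter (standing in for an offset register such as R17) are all assumptions made for the example.

```python
def sum_with_post_increment(memory, start, count, step=1):
    """Accumulate `count` words, updating the address after each read."""
    acc = 0        # plays the role of accumulator A0
    addr = start   # plays the role of address register R5
    for _ in range(count):
        acc += memory[addr]  # A0 = A0 + *R5 : read through the pointer
        addr += step         # ++ : the address is modified after the access
    return acc, addr


memory = [0, 10, 20, 30, 40]
print(sum_with_post_increment(memory, start=1, count=3))  # (60, 4)
```

On a DSP the address update is done by the address generation unit in parallel with the arithmetic, so the second statement in the loop body costs no extra instruction.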

Register-Indirect Addressing with Indexing
- The effective address is obtained by adding together the value in an address register and the value in another register (index register)
  - The values in the registers are not modified, unlike in the previous addressing mode
  - In the DSP5600x:
      MOVE Y1, X:(R6+N6), A1
- Sometimes the index value can be part of the instruction
  - In the TMS320C3X:
      LDI *-AR1(1),R7;
    The effective address is the contents of register AR1 minus one. The contents of AR1 are not modified.
  - Indexed addressing is useful when using the same code for accessing several data arrays

Register-Indirect Addressing with Indexing
- Compilers utilize indexed addressing for passing parameters on the stack
  - A stack frame is created each time a subroutine is called
  - The subroutine can access parameters consistently
  - No need for absolute addresses
  - The TI C compiler generates code for copying the first parameter (integer) of a subroutine to register R0 as follows (AR3 is used as the frame pointer):
      LDI *+AR3(2),R0;

[Figure: stack layout. The stack frame pointer (an address register) points into the STACK; the entries shown are: pointer to previous stack frame, number of parameters, parameter #1, parameter #2.]
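
A minimal sketch of indexed access through a frame pointer follows. The frame layout used here (previous frame pointer at offset 0, parameter count at offset 1, parameters from offset 2) is an assumption read off the order of the labels in the stack figure, and `load_indexed`, `stack` and `ar3` are hypothetical names chosen for illustration.

```python
def load_indexed(memory, base, displacement):
    """Return memory[base + displacement]; the base register is left unchanged."""
    return memory[base + displacement]


# Assumed frame layout relative to the frame pointer:
#   +0 pointer to previous stack frame
#   +1 number of parameters
#   +2 parameter #1
#   +3 parameter #2
stack = [0] * 16
ar3 = 8                                 # frame pointer (AR3 in the slide)
stack[ar3:ar3 + 4] = [3, 2, 111, 222]

r0 = load_indexed(stack, ar3, 2)        # models LDI *+AR3(2),R0
print(r0)                               # 111
```

Because every access is relative to `ar3`, the same subroutine code works wherever the frame happens to be placed on the stack.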

Register-Indirect Addressing with Modulo Address Arithmetic
- Data buffer management is often needed in DSP applications
  - In embedded systems, dynamic memory management is expensive
- Typically there is a need for a first-in-first-out (FIFO) buffer
  - The programmer maintains two pointers:
    - Read pointer: address of the memory location to be read next
    - Write pointer: address of the memory location where the next data value is to be written
  - each time a read or write operation is performed, the programmer needs to check whether the end of the buffer has been reached
  - at the end of the buffer, the pointer is reset to point to the beginning of the buffer

[Figure: linear buffer X0 X1 X2 X3 with ReadPointer and WritePointer.]

Register-Indirect Addressing with Modulo Address Arithmetic
- Modulo addressing (circular addressing) provides hardware support for checking the end of the buffer
  - address registers are updated with pre- or post-increment
  - address generation performs modulo arithmetic on the computation
    => the programmer sees a circular buffer

[Figure: the same buffer X0 X1 X2 X3 drawn as a ring with ReadPointer and WritePointer.]
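
The difference between the software end-of-buffer check and modulo address arithmetic can be illustrated as follows. This is a sketch under simple assumptions (a buffer occupying `length` consecutive addresses from `start`, increment of one); both function names are made up for the example.

```python
def advance_with_check(pointer, start, length):
    """Software approach: increment, then test for the end of the buffer."""
    pointer += 1
    if pointer == start + length:   # end of buffer reached?
        pointer = start             # wrap back to the beginning
    return pointer


def advance_modulo(pointer, start, length):
    """What modulo addressing computes implicitly on every pointer update."""
    return start + (pointer - start + 1) % length


start, length = 100, 4
pointer = start
visited = []
for _ in range(6):
    visited.append(pointer)
    pointer = advance_modulo(pointer, start, length)
print(visited)  # [100, 101, 102, 103, 100, 101]
```

Both functions produce the same sequence; the point of hardware modulo addressing is that the compare-and-reset step disappears from the instruction stream.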

Register-Indirect Addressing with Modulo Address Arithmetic
- Implementation #1
- The programmer needs to store the length of the circular buffer into a special modifier or modulo register
  - Each modifier register is associated with one or more address registers
  - The starting address of the buffer is not specified; the address register must contain a valid value before use
    - circular buffers must begin at k-word boundaries, where k is the smallest power of two that is equal to or greater than the size of the buffer
    - a 48-word circular buffer must reside on a 64-word boundary, i.e., the starting address may be 0, 64, 128, 192, etc.
  - This kind of mechanism can be found in TI TMS320C3X and C4X, Motorola, NEC, and Analog Devices DSPs

Register-Indirect Addressing with Modulo Address Arithmetic
  - The DSP56001 and DSP96002 have address register triplets Rx, Nx, and Mx, where x is 0 - 7. The address is stored in Rx, the increment used in post-auto-increment addressing is stored in Nx, and the length of the modulo-mode addressing buffer is in Mx. These registers can be read and written via the general data bus.
  - Auto-increment and modulo-mode arithmetic are performed in an independent address ALU. Thus, it is possible to access two circular buffers simultaneously.

[Figure: address generation unit block diagram. General data bus (24); low address ALU with registers N0-N3, M0-M3, R0-R3; high address ALU with registers R4-R7, M4-M7, N4-N7; multiplexers drive the address buses PAB (16), XAB (16) and YAB (16).]
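
The k-word boundary rule can be made concrete with a small model. The sketch below assumes, for illustration only, that the hardware derives the buffer base by clearing the low bits of the current address; the function names are hypothetical and the details differ between the processor families listed above.

```python
def buffer_alignment(length):
    """Smallest power of two that is equal to or greater than `length`."""
    k = 1
    while k < length:
        k *= 2
    return k


def modulo_post_increment(address, step, length):
    """Update an address inside a circular buffer of `length` words."""
    k = buffer_alignment(length)
    base = address & ~(k - 1)                 # base is implied by the upper address bits
    offset = (address - base + step) % length
    return base + offset


print(buffer_alignment(48))                   # 64
print(modulo_post_increment(64 + 47, 1, 48))  # 64: wraps to the start of the buffer
```

Since only the length is stored, a buffer that does not start on a k-word boundary would yield a wrong implied base, which is why the alignment restriction exists.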

Register-Indirect Addressing with Modulo Address Arithmetic
- Implementation #2
- An alternative mechanism is to utilize start and end registers
  - Hardware compares the address against the value in the end register
  - Modulo addressing may be used for any buffer
  - This mechanism can be found in the Lucent DSP16XX and TI TMS320C5X

Modulo Arithmetic in Lucent DSP16xx

[Figure: YAAU block diagram. Data bus; registers j, k, rb, re and r0-r3; an adder (ADD) and a comparator (CMP); RAM.]
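
The start/end register scheme can be sketched as below. The names `rb` (buffer begin) and `re` (buffer end) follow the register labels in the DSP16xx figure; the exact update rule shown (reload the pointer with `rb` when it equals `re`) is a simplified assumption for illustration.

```python
def post_increment_with_end_compare(address, rb, re):
    """Post-increment by one, wrapping to rb after the address held in re is used."""
    if address == re:       # comparator (CMP) matches the end register
        return rb           # pointer is reloaded with the start address
    return address + 1      # otherwise the adder (ADD) increments normally


rb, re = 100, 105           # a 6-word buffer at an arbitrary, unaligned address
address = rb
sequence = []
for _ in range(8):
    sequence.append(address)
    address = post_increment_with_end_compare(address, rb, re)
print(sequence)  # [100, 101, 102, 103, 104, 105, 100, 101]
```

Because both ends are stored explicitly, the buffer needs no power-of-two alignment, at the cost of two registers per buffer.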

Register-Indirect Addressing with Modulo Address Arithmetic
- Different DSPs may support different numbers of simultaneous circular buffers
  - The TI TMS320C5x supports two circular buffers and the Motorola DSP561xx four buffers. The Motorola DSP5600x and Analog Devices DSPs support eight circular buffers.

Register-Indirect Addressing with Bit Reversal
- Bit-reversed addressing is used mainly in the FFT

[Figure: 8-point FFT butterfly flow graph with inputs x0-x7, outputs X0-X7 and twiddle factors of W2, W4 and W8.]

Index mapping (bit-reversed permutation)

          Memory location      Before permutation
Output    decimal   binary     decimal   binary
X0        0         000        0         000
X1        1         001        4         100
X2        2         010        2         010
X3        3         011        6         110
X4        4         100        1         001
X5        5         101        5         101
X6        6         110        3         011
X7        7         111        7         111

Register-Indirect Addressing with Bit Reversal
- The hardware implementation may be
  - Real bit reversal between the address register and the address bus
  - Reverse-carry arithmetic in the AGU
  - In the TMS320C3X, bit-reversed addressing mode is denoted by the symbol "B". Suppose the data is stored in memory starting from address 60h (= AR2) and the length of the FFT is 16 (IR0 contains 8, half the length of the FFT):
      *AR2++(IR0)B; AR2 = 0110 0000 = 60 (0. sample)
      *AR2++(IR0)B; AR2 = 0110 1000 = 68 (1. sample)
      *AR2++(IR0)B; AR2 = 0110 0100 = 64 (2. sample)
      *AR2++(IR0)B; AR2 = 0110 1100 = 6c (3. sample)
      *AR2++(IR0)B; AR2 = 0110 0010 = 62 (4. sample)
      *AR2++(IR0)B; AR2 = 0110 1010 = 6a (5. sample)
      *AR2++(IR0)B; AR2 = 0110 0110 = 66 (6. sample)
      *AR2        ; AR2 = 0110 1110 = 6e (7. sample)

Short Addressing Modes
- Some addressing modes require several words in program memory (instruction code and data word)
- Some DSPs offer short versions which require only one instruction word
  => Short versions place some restrictions on usage
- Typical short addressing modes are:
  - Short immediate
  - Short memory-direct
  - Paged memory-direct
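
The address sequence in the TMS320C3X listing (60h, 68h, 64h, ...) can be reproduced with reverse-carry addition. The sketch below models it by reversing the low address bits, adding normally, and reversing back; the `width` parameter (4 bits for a 16-point FFT) and both function names are assumptions made for this example, not part of any TI tool.

```python
def reverse_bits(value, width):
    """Reverse the order of the lowest `width` bits of `value`."""
    result = 0
    for _ in range(width):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


def reverse_carry_add(address, increment, width):
    """Add `increment` to the low `width` bits with the carry propagating right."""
    mask = (1 << width) - 1
    total = reverse_bits(address & mask, width) + reverse_bits(increment & mask, width)
    return (address & ~mask) | reverse_bits(total & mask, width)


address = 0x60
sequence = []
for _ in range(8):
    sequence.append(address)
    address = reverse_carry_add(address, 8, 4)   # IR0 = 8, half the FFT length
print([hex(a) for a in sequence])
# ['0x60', '0x68', '0x64', '0x6c', '0x62', '0x6a', '0x66', '0x6e']
```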

Short Immediate Addressing
- If the immediate data word is small enough, it can be packed into the same instruction word as the operation code
- Typically negative numbers can also be used
  - The sign is extended automatically
  - In the DSP5600x, at most 12-bit operands may be used in immediate addressing:
      MOVE #1234, A

Short Memory-Direct Addressing
- If the direct memory address is small enough, it can be packed into the same instruction word as the operation code
  - In the DSP5600x, at most 6-bit addresses (00H - 3FH) may be used in short direct addressing:
      MOVE $10, A
- Sometimes a DSP may provide means to add an offset to the short direct address
  - In the DSP5600x, the I/O registers are at the end of the memory map (FFC0H - FFFFH). The MOVEP instruction may be used with short direct addressing to access those registers.
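
Automatic sign extension of a short immediate field can be sketched generically as follows. The function name and the 12-bit field width used in the example are illustrative assumptions; the code does not model the encoding of any particular processor.

```python
def sign_extend(value, bits):
    """Interpret the lowest `bits` bits of `value` as a two's-complement number."""
    sign_bit = 1 << (bits - 1)
    value &= (1 << bits) - 1            # keep only the immediate field
    return (value ^ sign_bit) - sign_bit


print(sign_extend(0x7FF, 12))   # 2047, largest positive 12-bit value
print(sign_extend(0xFFF, 12))   # -1
print(sign_extend(0x800, 12))   # -2048, most negative 12-bit value
```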

Paged Memory-Direct Addressing
- A special page register is used to hold the number of the page or section of memory to be accessed
  - When access outside this page is required, the page register must be updated
  - In the TMS320C2X and C5X, the data memories are divided into pages containing 128 (2^7) words. The programmer sets a page register to point to a specific page, and short direct addressing mode can be used to access data within the page.

[Figure: the 9-bit data page pointer (DP), loaded from the 16-bit data bus, is combined with the 7 LSBs from the instruction register to form the 16-bit data address.]
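
The address formation shown in the figure (9 page bits followed by 7 offset bits) amounts to a shift and an OR. A minimal sketch, with `paged_address`, `dp` and `offset7` as hypothetical names:

```python
def paged_address(dp, offset7):
    """Concatenate a 9-bit page number with a 7-bit in-page offset."""
    return ((dp & 0x1FF) << 7) | (offset7 & 0x7F)


print(hex(paged_address(4, 0x10)))       # 0x210: word 16 of page 4
print(hex(paged_address(0x1FF, 0x7F)))   # 0xffff: last word of the last page
```

Only the 7-bit offset travels in the instruction word; the page number changes only when the program explicitly reloads the page register.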

Instruction Set and Execution Control

Instruction Set
- Defines what are natural and efficient operations on the processor
- A processor with more instructions is not necessarily better
- Specialized instructions may require more silicon area
- Traditional instruction types
  - Multiplication and arithmetic
  - Logic operations
  - Shifting and rotation
  - Comparison

Looping
- DSP applications require repeated execution of a small number of arithmetic or multiplication instructions
- If the number of instructions in the inner loop is small, looping overhead lowers the performance
  => all DSPs provide hardware looping instructions (zero-overhead looping)
  - repeat a single instruction or a block of instructions without the normal decrement-test-branch sequence
  - loop counter update, test against the end condition, and branching are done by hardware

Looping
  - Software looping takes roughly three times longer to execute than hardware looping in the following:
      ;SW LOOPING
            MOVE #16,B
      LOOP: MAC (R0)+,(R4)+,A
            DEC B
            JNE LOOP
      ;HW LOOPING
            RPT #16
            MAC (R0)+,(R4)+,A
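
The "roughly three times" figure can be checked by counting executed instructions for the two listings. The sketch assumes, purely for illustration, that every instruction costs one cycle; on real hardware a taken branch often costs more, which would widen the gap. The function names are made up for the example.

```python
def software_loop_instructions(iterations, body=1):
    """One setup instruction, then body + decrement + branch per iteration."""
    return 1 + iterations * (body + 2)


def hardware_loop_instructions(iterations, body=1):
    """One repeat instruction, then only the loop body per iteration."""
    return 1 + iterations * body


sw = software_loop_instructions(16)   # MOVE + 16 * (MAC, DEC, JNE)
hw = hardware_loop_instructions(16)   # RPT + 16 * MAC
print(sw, hw, round(sw / hw, 2))      # 49 17 2.88
```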

Single and Multi-Instruction Loops
- A single-instruction loop repeats execution of one instruction
  - The repeated instruction is fetched once from the program memory
  - Consecutive executions free the program bus for operand fetch
- A multi-instruction loop repeats execution of a group of instructions
  - Instructions must be refetched on each iteration
  - The program bus is not available for operand fetch
- Some DSPs limit the number of instructions for hardware loops
  - The DSP16xx has a special 15-word buffer for repeated instructions, thus a repeat block can contain at most 15 instruction words

Looping Effects
- Typically a single-instruction loop disables interrupts
  - The maximum single-instruction loop lockout time must be considered
- Alternatives
  - Multi-instruction loops with a kernel of one instruction
  - Several single-instruction loops
- Some DSPs may also disable interrupts during multi-instruction loops

Loop Nesting Depth
- A nested loop is a loop inside another loop
- Approaches to handling nested loops
  - Directly nestable (Motorola, Analog Devices, NEC uPD7701x)
    - Nested hardware loops are allowed
    - Maximum depths range from three to seven
  - Partially nestable (DSP Group PineDSPCore, TI TMS320C3X, C4x, C5X)
    - A single-instruction loop is allowed inside a multi-instruction loop
    - Multi-instruction loops cannot be nested
  - Software nestable (TI TMS320C3x, TMS320C5x)
    - A multi-instruction loop can be nested by saving the state of the loop registers before entering the inner loop
  - Non-nestable (TI TMS320C2X, AT&T DSP16xx, DSP32xx)
    - Nesting of hardware loops is not supported

Branching
- Conditional/Unconditional
  - an unconditional branch is always taken
  - a conditional branch is taken only when the condition is fulfilled
- Delayed/Multi-Cycle
  - A multi-cycle branch requires several cycles to complete
  - A delayed branch allows a number of instructions following the branch to be executed; the branch requires only one cycle
- Delayed Branch with Nullify
  - The TMS320C4x provides a conditional delayed branch where instructions in the delay slots are conditionally executed
- PC-Relative
  - the branch target is not an absolute address
  - an offset from the current instruction location is used
  - needed in position-independent code

Conditional Instruction Execution
- An instruction is executed only if the given condition is true
  - Branches can be avoided in decision-intensive code
  - Useful in DSPs with deep pipelines, where branching produces extra overhead
  - The Analog Devices ADSP-21xx and ADSP-210xx allow the programmer to specify the conditions under which the instruction is executed.
- Condition codes are built into the instruction opcode
- Extra cycles in execution are needed
  - The TMS320C30 has a conditional load for fixed-point and floating-point operands (LDIcond and LDFcond). This is useful when searching for a minimum. In the following, the minimum of three operands is searched. AR2 points to the beginning of the operand array.
      LDI   *AR2,R3       ;load the first value
      CMPI  *+AR2(1),R3   ;compare it to the next value
      LDIGT *+AR2(1),R3   ;conditional load
      CMPI  *+AR2(2),R3   ;compare result to the 3rd value
      LDIGT *+AR2(2),R3   ;minimum is in register R3
  - The TMS320C5X and C54X provide the conditional execution instruction XC. If the specified condition is true, the next two single-word instructions or one two-word instruction are executed; otherwise NOPs are executed.

Orthogonality
- The extent to which the processor's instruction set is consistent
  - the more orthogonal the instruction set, the easier the processor is to program
  - there are fewer inconsistencies and special cases
  - orthogonality is a subjective topic
  - consistency and completeness of the instruction set
    - e.g., a processor with an add instruction but no subtract instruction would be non-orthogonal
  - degree to which operands and addressing modes are uniformly available with different operations
    - e.g., a processor which provides register-indirect addressing mode for add but not for subtract is non-orthogonal
- Processors with larger instruction word widths tend to be more orthogonal
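
The branch-free minimum search in the TMS320C30 listing can be mirrored step by step. In this sketch `conditional_load` is a hypothetical stand-in for LDIcond: it always "executes" but only replaces the register value when the condition holds. It models the logic only, not the timing.

```python
def conditional_load(condition, new_value, current):
    """Return new_value if the condition holds, otherwise keep the current value."""
    return new_value if condition else current


def minimum_of_three(values):
    r3 = values[0]                                         # LDI: load the first value
    r3 = conditional_load(r3 > values[1], values[1], r3)   # CMPI + LDIGT with 2nd value
    r3 = conditional_load(r3 > values[2], values[2], r3)   # CMPI + LDIGT with 3rd value
    return r3


print(minimum_of_three([7, 3, 5]))   # 3
print(minimum_of_three([2, 9, 1]))   # 1
```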

Minimize Instruction Word Width
- Reduced number of operations
  - fewer operations -> fewer bits for the opcode
  - e.g., the DSP16xx does not support rotation
- Reduced number of addressing modes
  - processors with a smaller instruction word width provide fewer addressing modes
  - limitations on the allowable combinations of operations and addressing modes
- Restrictions on source/destination operands
  - e.g., only one certain register can be used as the address register in certain instructions
- Use of mode bits
  - a mode bit specifies what the actual operation of an instruction is
  - e.g., the TMS320C5x does not have separate arithmetic and logical shift instructions, thus the actual operation is defined by the shift mode bit
  - e.g., an accumulator shift takes the shift count from a special register rather than from the instruction word
- Most of these features complicate programming, but a narrower instruction word reduces the required die size

Assembly Language Format
- A) Traditional opcode-operand assembly syntax
  - instructions are expressed as an instruction mnemonic and its operands
      MPY X0,X0 ;multiply
      ADD P,A ;add product to accumulator
      MOV (R0),X0 ;
      JMP LOOP
- B) Functional, C-like or algebraic syntax (C-like arithmetic shorthand)
      P = X0*X0
      A = P+A
      X0 = *R0
      GOTO LOOP
  - Algorithms are expressed close to the mathematical form
  - the code is not actually C; experienced C programmers may find it frustrating because the syntax does not support all of C
- Assembly language syntax is not related to the instruction set of the processor
  - a single processor may have two assemblers
  - as long as the assemblers generate the same binary opcodes, the syntaxes are the same from the processor's point of view
