Академический Документы
Профессиональный Документы
Культура Документы
TMS320C6000 Architectural
Overview
Learning Objectives
Describe C6000 CPU architecture.
Introduce some basic instructions.
Describe the C6000 memory map.
Provide an overview of the peripherals.
Chapter 2, Slide 2 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
General DSP System Block Diagram
Internal Memory
Internal Buses
P
E
External R
Memory I
Central P
H
Processing E
R
Unit A
L
S
Chapter 2, Slide 3 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
Implementation of Sum of Products (SOP)
N
It has been shown in Y = an * xn
Chapter 1 that SOP is the n = 1
key element for most DSP = a1 * x1 + a2 * x2 +... + aN * xN
algorithms.
Two basic
So let’s write the code for
this algorithm and at the operations are required
same time discover the for this algorithm.
C6000 architecture.
(1) Multiplication
(2) Addition
Therefore two basic
instructions are required
Chapter 2, Slide 4 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
Implementation of Sum of Products (SOP)
N
So let’s implement the SOP Y = an * xn
algorithm! n = 1
= a1 * x1 + a2 * x2 +... + aN * xN
Chapter 2, Slide 5 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
Multiply (MPY)
N
Y = an * xn
n = 1
= a1 * x1 + a2 * x2 +... + aN * xN
Chapter 2, Slide 6 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
Multiply (.M unit)
40
Y = an * xn
n = 1
.M
The . M unit performs multiplications in
hardware
.M
Chapter 2, Slide 8 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
Add (.L unit)
40
Y = an * xn
n = 1
.M
. .L
. MPY .M a1, x1, prod
.
ADD .L Y, prod, Y
A15
32-bits
. .L
. MPY .M A0, A1, A3
.
ADD .L A4, A3, A4
A15
32-bits
. .L
. MPY .M A0, A1, A3
.
ADD .L A4, A3, A4
A15
32-bits
. .L
.
.
A15
32-bits
Chapter 2, Slide 13 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
Load Unit “.D”
A15
.D
32-bits
Data Memory
Chapter 2, Slide 14 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
Load Unit “.D”
Register File A
It is worth noting at this
A0 a1 stage that the only way to
A1 x1 access memory is through the
A2 .D unit.
A3 prod .M
Y
. .L
.
.
A15
.D
32-bits
Data Memory
Chapter 2, Slide 15 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
Load Instruction
. .L
.
.
A15
.D
32-bits
Data Memory
Chapter 2, Slide 16 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
Load Instructions (LDB, LDH,LDW,LDDW)
. .L
.
.
A15
.D
32-bits
Data Memory
Chapter 2, Slide 17 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
Using the Load Instructions
FFFFFFFF
16-bits
Chapter 2, Slide 18 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
Using the Load Instructions
16-bits
Chapter 2, Slide 19 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
Using the Load Instructions
FFFFFFFF
16-bits
Chapter 2, Slide 20 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
Using the Load Instructions
Chapter 2, Slide 21 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
Using the Load Instructions
Data address
The syntax for the load 1 0
instruction is: 0xA 0xB 00000000
0xC 0xD 00000002
LD *Rn,Rm
0x2 0x1 00000004
0x4 0x3 00000006
Example:
If we assume that A5 = 0x4 then: 0x6 0x5 00000008
(1) LDB *A5, A7 ; gives A7 = 0x00000001 0x8 0x7
(2) LDH *A5,A7; gives A7 = 0x00000201
(3) LDW *A5,A7; gives A7 = 0x04030201
(4) LDDW *A5,A7:A6; gives A7:A6 =
0x0807060504030201
FFFFFFFF
16-bits
Chapter 2, Slide 22 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
Using the Load Instructions
FFFFFFFF
16-bits
Chapter 2, Slide 23 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
Loading the Pointer Rn
The instruction MVKL will allow a
move of a 16-bit constant into a register
as shown below:
MVKL .? a, A5
(‘a’ is a constant or label)
MVKL a, A5
MVKH a, A5
Chapter 2, Slide 25 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
Loading the Pointer Rn
Always use MVKL then MVKH, look at
the following examples:
Example 1
A5 = 0x87654321
Example 2
Chapter 2, Slide 26 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
LDH, MVKL and MVKH
Register File A
A0 a MVKL pt1, A5
A1 x MVKH pt1, A5
A2 MVKL pt2, A6
A3 prod .M MVKH pt2, A6
A4 Y
LDH .D *A5, A0
. .L LDH .D *A6, A1
.
.
MPY .M A0, A1, A3
ADD .L A4, A3, A4
.D
A15
32-bits
Chapter 2, Slide 27 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
Creating a loop
Chapter 2, Slide 28 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
Creating a loop
Chapter 2, Slide 29 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
What are the steps for creating a loop
Chapter 2, Slide 30 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
1. Create a label to branch to
MVKL pt1, A5
MVKH pt1, A5
MVKL pt2, A6
MVKH pt2, A6
Chapter 2, Slide 31 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
2. Add a branch instruction, B.
MVKL pt1, A5
MVKH pt1, A5
MVKL pt2, A6
MVKH pt2, A6
Chapter 2, Slide 32 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
Which unit is used by the B instruction?
MVKL pt1, A5
Register File A
MVKH pt1, A5
A0 a .S MVKL pt2, A6
A1 x
MVKH pt2, A6
A2
prod .M
A3 .M
Y loop LDH .D *A5, A0
. LDH .D *A6, A1
. .L
. .L MPY .M A0, A1, A3
ADD .L A4, A3, A4
.D B .? loop
A15 .D
32-bits
Data Memory
Chapter 2, Slide 33 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
Which unit is used by the B instruction?
MVKL .S pt1, A5
Register File A
MVKH .S pt1, A5
A0 a .S MVKL .S pt2, A6
A1 x
MVKH .S pt2, A6
A2
prod .M
A3 .M
Y loop LDH .D *A5, A0
. LDH .D *A6, A1
. .L
. .L MPY .M A0, A1, A3
ADD .L A4, A3, A4
.D B .S loop
A15 .D
32-bits
Data Memory
Chapter 2, Slide 34 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
3. Create a loop counter.
MVKL .S pt1, A5
Register File A
MVKH .S pt1, A5
A0 a .S MVKL .S pt2, A6
A1 x
MVKH .S pt2, A6
A2 MVKL .S count, B0
prod .M
A3 .M
Y loop LDH .D *A5, A0
. LDH .D *A6, A1
. .L
. .L MPY .M A0, A1, A3
ADD .L A4, A3, A4
.D B .S loop
A15 .D
32-bits
B registers will be introduced later
Data Memory
Chapter 2, Slide 35 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
4. Decrement the loop counter
MVKL .S pt1, A5
Register File A
MVKH .S pt1, A5
A0 a .S MVKL .S pt2, A6
A1 x
MVKH .S pt2, A6
A2 MVKL .S count, B0
prod .M
A3 .M
Y loop LDH .D *A5, A0
. LDH .D *A6, A1
. .L
. .L MPY .M A0, A1, A3
ADD .L A4, A3, A4
.D SUB .S B0, 1, B0
A15 .D
B .S loop
32-bits
Data Memory
Chapter 2, Slide 36 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
5. Make the branch conditional based on the
value in the loop counter
What is the syntax for making instruction
conditional?
[condition] Instruction Label
e.g.
[B1] B loop
Chapter 2, Slide 37 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
5. Make the branch conditional based on the
value in the loop counter
The condition can be inverted by adding the
exclamation symbol “!” as follows:
[!condition] Instruction Label
e.g.
[!B0] B loop ;branch if B0 = 0
[B0] B loop ;branch if B0 != 0
Chapter 2, Slide 38 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
5. Make the branch conditional
MVKL .S2 pt1, A5
Register File A
MVKH .S2 pt1, A5
A0 a .S MVKL .S2 pt2, A6
A1 x
MVKH .S2 pt2, A6
A2 MVKL .S2 count, B0
prod .M
A3 .M
Y loop LDH .D *A5, A0
. LDH .D *A6, A1
. .L
. .L MPY .M A0, A1, A3
ADD .L A4, A3, A4
.D SUB .S B0, 1, B0
A15 .D
[B0] B .S loop
32-bits
Data Memory
Chapter 2, Slide 39 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
More on the Branch Instruction (1)
Chapter 2, Slide 40 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
More on the Branch Instruction (2)
32-bit
5-bit register
B code
Chapter 2, Slide 41 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
Testing the code
MVKL .S2 pt1, A5
MVKH .S2 pt1, A5
MVKL .S2 pt2, A6
MVKH .S2 pt2, A6
This code performs the following MVKL .S2 count, B0
operations:
loop LDH .D *A5, A0
a0*x0 + a0*x0 + a0*x0 + … + a0*x0 LDH .D *A6, A1
MPY .M A0, A1, A3
However, we would like to perform: ADD .L A4, A3, A4
a0*x0 + a1*x1 + a2*x2 + … + aN*xN SUB .S B0, 1, B0
[B0] B .S loop
Chapter 2, Slide 42 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
Modifying the pointers
MVKL .S2 pt1, A5
MVKH .S2 pt1, A5
MVKL .S2 pt2, A6
MVKH .S2 pt2, A6
MVKL .S2 count, B0
Chapter 2, Slide 43 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
Indexing Pointers
Pointer
Syntax Description
Modified
*R Pointer No
Chapter 2, Slide 44 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
Indexing Pointers
Pointer
Syntax Description
Modified
*R Pointer No
*+R[disp] + Pre-offset No
*-R[disp] - Pre-offset No
Chapter 2, Slide 49 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
Store the final result
MVKL .S2 pt1, A5
MVKH .S2 pt1, A5
MVKL .S2 pt2, A6
MVKH .S2 pt2, A6
MVKL .S2 count, B0
Chapter 2, Slide 50 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
Store the final result
MVKL .S2 pt1, A5
MVKH .S2 pt1, A5
MVKL .S2 pt2, A6
MVKH .S2 pt2, A6
MVKL .S2 count, B0
Chapter 2, Slide 51 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
Store the final result
MVKL .S2 pt1, A5
MVKH .S2 pt1, A5
MVKL .S2 pt2, A6
MVKH .S2 pt2, A6
MVKL .S2 pt3, A7
MVKH .S2 pt3, A7
MVKL .S2 count, B0
The Pointer A7 is now initialised.
loop LDH .D *A5++, A0
LDH .D *A6++, A1
MPY .M A0, A1, A3
ADD .L A4, A3, A4
SUB .S B0, 1, B0
[B0] B .S loop
STH .D A4, *A7
Chapter 2, Slide 52 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
What is the initial value of A4?
MVKL .S2 pt1, A5
MVKH .S2 pt1, A5
MVKL .S2 pt2, A6
MVKH .S2 pt2, A6
MVKL .S2 pt3, A7
MVKH .S2 pt3, A7
A4 is used as an accumulator, MVKL .S2 count, B0
so it needs to be reset to zero. ZERO .L A4
loop LDH .D *A5++, A0
LDH .D *A6++, A1
MPY .M A0, A1, A3
ADD .L A4, A3, A4
SUB .S B0, 1, B0
[B0] B .S loop
STH .D A4, *A7
Chapter 2, Slide 53 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
Increasing the processing power!
Register File A
A0
.S1
A1
A2
How can we add
A3 .M1
more processing
A4 power to this
. .L1
processor?
.
.
.D1
A15
32-bits
Data Memory
Chapter 2, Slide 54 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
Increasing the processing power!
Register File A
A0
.S1
A1 (1) Increase the clock
A2 frequency.
A3 .M1
A4 (2) Increase the number
of Processing units.
. .L1
.
.
.D1
A15
32-bits
Data Memory
Chapter 2, Slide 55 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
To increase the Processing Power, this processor has two
sides (A and B or 1 and 2)
Register File A Register File B
A0 B0
.S1 .S2
A1 B1
A2 B2
A3 .M1 .M2 B3
A4 B4
. .L1 .L2 .
. .
. .
.D1 .D2
A15 B15
32-bits 32-bits
Data Memory
Chapter 2, Slide 56 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
Can the two sides exchange operands in order to increase
performance?
Register File A Register File B
A0 B0
.S1 .S2
A1 B1
A2 B2
A3 .M1 .M2 B3
A4 B4
. .L1 .L2 .
. .
. .
.D1 .D2
A15 B15
32-bits 32-bits
Data Memory
Chapter 2, Slide 57 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
The answer is YES but there are limitations.
To exchange operands between the two
sides, some cross paths or links are
required.
Chapter 2, Slide 59 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
TMS320C67x Data-Path
Chapter 2, Slide 60 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
Data Cross Paths
Data cross paths only apply to the .L, .S
and .M units.
The data cross paths are very useful,
however there are some limitations in
their use.
Chapter 2, Slide 61 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
Data Cross Path Limitations
<dst>
.L1 <src>
A
.M1 <src>
.S1 2x
Chapter 2, Slide 62 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
Data Cross Path Limitations
<dst>
.L1 <src>
A
.M1 <src>
.S1 2x
eg:
ADD .L1x A0,A1,B2
MPY .M1x A0,B6,A9 1x
SUB .S1x A8,B2,A8
|| ADD .L1x A0,B0,A2
B
|| Means that the SUB and ADD
belong to the same fetch packet,
therefore execute
Chapter 2, Slide 63 simultaneously. Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
Data Cross Path Limitations
<dst>
.L1 <src>
A
.M1 <src>
.S1 2x
eg:
ADD .L1x A0,A1,B2
MPY .M1x A0,B6,A9 1x
SUB .S1x A8,B2,A8
|| ADD .L1x A0,B0,A2
B
NOT VALID!
Chapter 2, Slide 64 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
Data Cross Paths for both sides
<dst>
.L1 <src>
A
.M1 <src>
.S1 2x
1x
<src>
<dst>
.L2
.M2 <src> B
.S2
Chapter 2, Slide 65 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
Address cross paths
Data
Addr
A
.D1
LDW.D1T1 *A0,A5
STW.D1T1 A5,*A0
Chapter 2, Slide 66 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
Load or store to either side
Data1 A5
DA1 = T1 *A0
A
.D1
DA2 = T2
LDW.D1T1 *A0,A5
LDW.D1T2 *A0,B5
B
Data2 B5
Chapter 2, Slide 67 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
Standard Parallel Loads
Data1 A5
DA1 = T1 *A0
A
.D1
DA2 = T2 *B0
.D2
B
B5
LDW.D1T1 *A0,A5
|| LDW.D2T2 *B0,B5
Chapter 2, Slide 68 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
Parallel Load/Store using address cross paths
Data1 A5
DA1 = T1 *A0
A
.D1
DA2 = T2 *B0
.D2
B
B5
LDW.D1T2 *A0,B5
|| STW.D2T1 A5,*B0
Chapter 2, Slide 69 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
Fill the blanks ... Does this work?
Data1
DA1 = T1 *A0
A
.D1
DA2 = T2 *B0
.D2
B
LDW.D1__ *A0,B5
|| STW.D2__ B6,*B0
Chapter 2, Slide 70 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
Not Allowed!
Parallel accesses: both cross or neither cross
Data1
*A0
A
.D1
DA2 = T2
*B0
.D2
B5
B
LDW.D1T2 *A0,B5
B6
|| STW.D2T2 B6,*B0
Chapter 2, Slide 71 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
Conditions Don’t Use Cross Paths
If a conditional register comes from the
opposite side, it does NOT use a
data or address cross-path.
Examples:
Chapter 2, Slide 72 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
‘C62x Data-Path Summary
CPU
Ref Guide
Chapter 2, Slide 73 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
‘C67x Data-Path Summary ‘C67x
Chapter 2, Slide 74 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
Cross Paths - Summary
Data
Destination register on same side as unit.
Source registers - up to one cross path per
execute packet per side.
Use “x” to indicate cross-path.
Address
Pointer must be on same side as unit.
Data can be transferred to/from either side.
Parallel accesses: both cross or neither cross.
Chapter 2, Slide 75 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
Code Review (using side A only)
40
Y = an * xn
n = 1
Note: Assume that A4 was previously cleared and the pointers are initialised.
Chapter 2, Slide 76 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
Let us have a look at the final details
concerning the functional units.
Chapter 2, Slide 77 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
Operands - 32/40-bit Register, 5-bit Constant
Operands can be:
5-bit constants (or 16-bit for MVKL and MVKH).
32-bit registers.
40-bit Registers.
However, we have seen that registers are only
32-bit.
Chapter 2, Slide 78 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
Operands - 32/40-bit Register, 5-bit Constant
A 40-bit register can be obtained by
concatenating two registers.
However, there are 3 conditions that need
to be respected:
The registers must be from the same side.
The first register must be even and the second
odd.
The registers must be consecutive.
Chapter 2, Slide 79 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
Operands - 32/40-bit Register, 5-bit Constant
All combinations of 40-bit registers are
shown below:
40-bit Reg 40-bit Reg
odd : even odd : even
8 32 8 32
A1:A0 B1:B0
A3:A2 B3:B2
A5:A4 B5:B4
A7:A6 B7:B6
A9:A8 B9:B8
A11:A10 B11:B10
A13:A12 B13:B12
A15:A14 B15:B14
Chapter 2, Slide 80 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
Operands - 32/40-bit Register, 5-bit Constant
instr .unit <src>, <src>, <dst>
.L or .S
< dst >
32-bit 40-bit
Reg Reg
Chapter 2, Slide 81 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
Operands - 32/40-bit Register, 5-bit Constant
instr .unit <src>, <src>, <dst>
.L or .S
< dst >
32-bit 40-bit
Reg Reg
Chapter 2, Slide 82 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
Operands - 32/40-bit Register, 5-bit Constant
instr .unit <src>, <src>, <dst>
32-bit 40-bit
Reg Reg
Chapter 2, Slide 83 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
Operands - 32/40-bit Register, 5-bit Constant
instr .unit <src>, <src>, <dst>
32-bit 40-bit
Reg Reg
Chapter 2, Slide 84 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
Operands - 32/40-bit Register, 5-bit Constant
instr .unit <src>, <src>, <dst>
32-bit 40-bit
Reg Reg
Chapter 2, Slide 85 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
Operands - 32/40-bit Register, 5-bit Constant
instr .unit <src>, <src>, <dst>
Chapter 2, Slide 86 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
Operands - 32/40-bit Register, 5-bit Constant
instr .unit <src>, <src>, <dst>
Chapter 2, Slide 87 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
Register to register data transfer
To move the content of a register (A or B)
to another register (B or A) use the move
“MV” Instruction, e.g.:
MV A0, B0
MV B6, B7
To move the content of a control register
to another register (A or B) or vice-versa
use the MVC instruction, e.g.:
MVC IFR, A0
MVC A0, IRP
Chapter 2, Slide 88 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
TMS320C6000 Instruction Set
Chapter 2, Slide 89 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
'C62x Instruction Set (by category)
Arithmetic Logical Data Mgmt
ABS AND
LDB/H/W
ADD CMPEQ
MV
ADDA CMPGT
MVC
ADDK CMPLT
MVK
ADD2 NOT
MVKL
MPY OR
MVKH
MPYH SHL
MVKLH
NEG SHR
STB/H/W
SMPY SSHL
SMPYH XOR
SADD Program Ctrl
SAT
SSUB Bit Mgmt B
IDLE
SUB CLR NOP
SUBA EXT
SUBC LMBD
SUB2 NORM
ZERO SET
Note: Refer to the 'C6000 CPU Reference Guide for more details.
Chapter 2, Slide 90 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
'C62x Instruction Set (by unit)
.S Unit .L Unit
ADD MVKLH ABS NOT
ADDK NEG ADD OR
ADD2 NOT AND SADD
AND OR CMPEQ SAT
B SET CMPGT SSUB
CLR SHL CMPLT SUB
EXT SHR LMBD SUBC
MV SSHL MV XOR
MVC SUB NEG ZERO
MVK SUB2 NORM
MVKL XOR
MVKH ZERO .D Unit
ADD STB/H/W
.M Unit ADDA SUB
MPY SMPY LDB/H/W SUBA
MPYH SMPYH MV ZERO
NEG
Other Note: Refer to the 'C6000 CPU
Reference Guide for more details.
NOP IDLE
Chapter 2, Slide 91 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
' C6700: Superset of Fixed-Point (by unit)
.S Unit .L Unit
ADD NEG ABSSP ABS NOT ADDSP
ADDK NOT ABSDP ADD OR ADDDP
.S ADD2 OR CMPGTSP AND SADD SUBSP
AND SET CMPEQSP CMPEQ SAT SUBDP
B SHL CMPLTSP CMPGT SSUB INTSP
CLR SHR CMPGTDP CMPLT SUB INTDP
EXT SSHL CMPEQDP LMBD SUBC SPINT
.L MV SUB CMPLTDP MV XOR DPINT
‘C67x
MVC SUB2 RCPSP NEG ZERO SPRTUNC
MVK XOR RCPDP NORM DPTRUNC
MVKL ZERO RSQRSP DPSP
.D MVKH RSQRDP
SPDP .M Unit
MPY SMPY MPYSP
.D Unit MPYH SMPYH MPYDP
.M ADD NEG MPYLH MPYI
ADDAB (B/H/W) STB (B/H/W) MPYHL MPYID
ADDAD SUB
LDB (B/H/W) SUBAB (B/H/W) No Unit Used
LDDW ZERO NOP IDLE
MV Note: Refer to the 'C6000 CPU
Reference Guide for more details.
Chapter 2, Slide 92 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
Superset of Fixed-Point
Instruction Fetch Control Registers
Interrupt
Control
Instruction Dispatch Emulation
Advanced Instruction
Packing Advanced
Emulation
Instruction Decode
L1 S1 M1 D1 D2 M2 S2 L2
+ + X + + X + +
+ x x +
+ x + + x +
+ + x x + +
+ + x X X x + +
‘C62x: Dual 32-Bit Load/Store
‘C64x: Dual 64-Bit Load/Store
‘C67x: Dual 64-Bit Load/32-Bit Store
Chapter 2, Slide 93 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
'C64x: Superset of ‘C62x
Dual/Quad Arith Data Pack/Un Compares Dual/Quad Arith Data Pack/Un
.S SADD2 PACK2 CMPEQ2 .L ABS2 PACK2
SADDUS2 PACKH2 CMPEQ4 ADD2 PACKH2
SADD4 PACKLH2 CMPGT2 ADD4 PACKLH2
PACKHL2 CMPGT4 MAX PACKHL2
Bitwise Logical UNPKHU4 MIN PACKH4
ANDN UNPKLU4 Branches/PC SUB2 PACKL4
BDEC
Shifts & Merge SWAP2 BPOS
SUB4 UNPKHU4
SHR2 SPACK2 SUBABS4 UNPKLU4
SPACKU4 BNOP SWAP2/4
SHRU2 ADDKPC Bitwise Logical
SHLMB ANDN
SHRMB Multiplies
Shift & Merge MPYHI
SHLMB MPYLI
Dual Arithmetic Mem Access MPYHIR
ADD2 LDDW SHRMB
.D MPYLIR
SUB2 LDNW .M Load Constant MPY2
LDNDW MVK (5-bit) SMPY2
Bitwise Logical STDW
AND Average Bit Operations DOTP2
STNW AVG2 BITC4 DOTPN2
ANDN STNDW
OR AVG4 BITR DOTPRSU2
XOR Load Constant DEAL DOTPNRSU2
Shifts SHFL DOTPU4
MVK (5-bit) ROTL
Address Calc. DOTPSU4
ADDAD SSHVL Move GMPY4
SSHVR MVD XPND2/4
Chapter 2, Slide 94 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
TMS320C6000 Memory
Chapter 2, Slide 95 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
Memory size per device
Devices Internal EMIFA EMIFB
C6201, C6701 P = 64 kB
C6204, C6205 D = 64 kB
P = 384 kB
C6203
D = 512 kB
L1P = 4 kB
C6713 128M Bytes
L1D = 4 kB N/A
L2 = 256 kB (32-bits wide)
L1P = 16 kB
C6411 128M Bytes
DM642 L1D = 16 kB N/A
L2 = 256 kB (32-bits wide)
Chapter 2, Slide 96 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
Internal Memory Summary
Devices Internal External
(L2)
C6211
512M
C6711 64 kB
(32-bit wide)
C6713
512M
C6712 256 kB
(16-bit wide)
Devices Internal External
(L2)
C6414
A: 1GB (64-bit)
C6415 1 MB
B: 256kB (16-bit)
C6416
DM642 256 kB 1GB (64-bit)
C6411 256 kB 256MB (32-bit)
LINK: TMS320C6000 DSP Generation
Chapter 2, Slide 97 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
TMS320C6000 Peripherals
Chapter 2, Slide 98 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
'C6x System Block Diagram
Memory
External
Internal Buses
Memory P
E
R
Regs (A0-A15) .D1 .D2 I
Regs (B0-B15)
P
.M1 .M2 H
.L1 .L2 E
R
.S1 .S2 A
L
Control Regs S
CPU
Chapter 2, Slide 99 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
‘C6x Internal Buses
Program Addr x32
Internal PC
Memory Program Data x256
A
Data Addr - T1 x32 A
D regs
Data Data - T1 x32/64
External
Interface Data Addr - T2 x32 B
A Data Data - T2 x32/64 regs
x32
D
DMA Addr - Read x32
Peripherals
DMA Data - Read x32
A DMA
x32 DMA Addr - Write x32
D DMA Data - Write x32
‘C67x can perform 64-bit data loads.
Chapter 2, Slide 100 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
'C6x System Block Diagram
Memory
Internal Buses
P
E
EMIF R
Regs (A0-A15) .D1 .D2 I
Regs (B0-B15)
P
.M1 .M2 H
Ext’l .L1 .L2 E
Memory R
.S1 .S2 A
L
Control Regs S
CPU
Chapter 2, Slide 101 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
'C6x System Block Diagram
Program
RAM Data Ram
Addr
Internal Buses
D (32)
P
E
EMIF R
Regs (A0-A15) .D1 .D2 I
Regs (B0-B15)
Ext’l P
.M1 .M2 H
Memory E
.L1 .L2
- Sync R
.S1 .S2 A
- Async L
Control Regs S
CPU
Chapter 2, Slide 102 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
'C6000 Peripherals
Parallel
Comm Internal
Memory
GPIO
Register Set B
Register Set A
(Boot) .M1 .M2
Timers
.L1 .L2
Ethernet
Video Ports .S1 .S2
VCP / TCP CPU
PLL
Chapter 2, Slide 103 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
EMIF
Async
Internal
Memory
.D1 .D2
Register Set B
Register Set A
SBSRAM .M1 .M2
.L1 .L2
External Memory Interface (EMIF).S1 .S2
Glueless access to async/sync memory CPU
Works with PC100/133 SDRAM (cheap, fast, and easy!)
Byte-wide data access
16, 32, or 64-bit bus widths
Chapter 2, Slide 104 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
HPI / XBUS / PCI
Parallel
Comm Internal
Memory
.D1 .D2
Register Set B
Register Set A
Parallel Communication Interfaces
.M1 .M2
HPI: Dedicated, slave-only, async 16/32-bit bus allows
host-P access to C6000 memory .L1 .L2
XBUS: Similar to HPI but provides …
Master/slave and sync modes
.S1 .S2
Glueless i/f to FIFOs (up to single-cycle
CPUxfer rate)
PCI: Standard 32-bit, 33MHz/66MHz PCI interface
These interfaces provide means to bootstrap the C6000
Chapter 2, Slide 105 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
GPIO
Parallel
Comm Internal
Memory
GPIO
.D1 .D2
Register Set B
Register Set A
.M1 .M2
.L1 .L2
.S1 .S2
General Purpose Input/Output (GPIO)
CPU
C64x and C6713 provide 8-16 bits of general purpose bit I/O
Use to observe or control the signal of a single-pin
Chapter 2, Slide 106 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
McBSP and Utopia
Multi-Channel Buffered Serial Port (McBSP)
Parallel
2 (or 3) full-duplex,
Comm Internal
synchronous serial-ports
Up to 100 Mb/sec performance Memory
GPIO operation (T1, E1, MVIP, …)
Supports multi-channel
Register Set B
Register Set A
Multi-Channel Audio Serial Port.M1
(McASP)
.M2
McBSP features plus more …
Up to 8 stereo lines (16 channels) .L1 .L2
IIC support
On DM642, C6713 .S1 .S2
Utopia (C64x) CPU
ATM connection
50 MHz wide area network connectivity
Chapter 2, Slide 107 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
DMA / EDMA
Parallel
Comm Internal
Memory
GPIO
Register Set B
Register Set A
(Boot) .M1 .M2
.L1 .L2
Direct Memory Access (DMA / EDMA)
Transfers any set of memory locations to another
.S1 .S2
4 / 16 / 64 channels (transfer parameter sets)
Transfers can be triggered by any interrupt (sync)
CPU
Operates independent of CPU
On reset, provides bootstrap from memory
Chapter 2, Slide 108 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
Timer/Counter
Parallel
Comm Internal
Memory
GPIO
Register Set B
Register Set A
(Boot) .M1 .M2
Timers
.L1 .L2
.S1 .S2
Timer / Counter
Two (or three) 32-bit timer/counters CPU
Can generate interrupts
Both input and output pins
Chapter 2, Slide 109 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
Ethernet MAC
Parallel
Comm Internal
Memory
GPIO
Register Set B
Register Set A
(Boot) .M1 .M2
Timers
.L1 .L2
Ethernet
Video Ports .S1 .S2
Ethernet (DM642 only)
VCP MAC
10/100 Ethernet / TCP CPU
Pins are muxedPLL
with PCI
TCP/IP stack available from TI
Chapter 2, Slide 110 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
Video Ports
Parallel
Comm Internal
Memory
GPIO
Register Set B
Register Set A
Dual 8/10-bit BT656 or raw modes
(Boot) .M1 .M2
16/20-bit raw modes and 20-bit Y/C for high definition
Horz Scaling andTimers
Chroma Resampling Support for 8-bit modes
Supports transport interface mode .L1 .L2
Ethernet
Video Ports .S1 .S2
VCP / TCP CPU
PLL
Chapter 2, Slide 111 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
VCP / TCP -- 3G Wireless
Parallel
Comm Internal
Memory
GPIO
Register Set B
Register Set A
DMA, EDMA
Viterbi Coprocessor (VCP) (C6416 only) .M1 .M2
(Boot)channels at 8 kbps
Supports > 500 voice
Programmable decoder parameters .L1
include .L2 length,
constraint
Timers
code rate, and frame length
Video Ports .S1 .S2
VCP / TCP CPU
PLL
Chapter 2, Slide 112 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
Phase Locked Loop (PLL)
Parallel
Comm Internal
Memory
GPIO
Register Set B
Register Set A
(Boot)
DMA, EDMA CLKIN
.M1 .M2
PLL
Clock multiplierTimers
(Boot) Output.L1 .L2
Reduces EMI andEthernet
Timers
cost CLKOUT1
Rate is Pin selectable
Video Ports .S1 .S2
CLKOUT2
VCP / TCP (reduced
CPUrate clkout)
PLL
Chapter 2, Slide 113 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
Clock Cycle
What is a clock cycle?
The time between successive instructions
C6000
CLKOUT1 (C6000 clock cycle)
CLKIN PLL CLKOUT2 (½, ¼, or 1/6 CLKOUT1)
When we talk
about cycles ...
Register Set B
Register Set A
(Boot) .M1 .M2
Timers
.L1 .L2
Ethernet
Timers
Video Ports .S1 .S2
VCP / TCP CPU
PLL
Chapter 2, Slide 115 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
‘C6x Family Part Numbering
Example = TMS320LC6201PKGA200
TMS320 = TI DSP
L = Place holder for voltage levels
C6 = C6x family
2 = Fixed-point core
01 = Memory/peripheral configuration
PKG = Pkg designator (actual letters TBD)
A = -40 to 85C (blank for 0 to 70C)
200 = Core CPU speed in Mhz
Chapter 2, Slide 116 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
Module 1 Exam
1. Functional Units
a. How many can perform an ADD? Name them.
Chapter 2, Slide 117 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
3. Conditional Code
a. Which registers can be used as cond’l registers?
4. Performance
a. What is the 'C6711 instruction cycle time?
Chapter 2, Slide 118 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
5. Coding Problems
a. Move contents of A0-->A1
Chapter 2, Slide 119 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
5. Coding Problems
a. Move contents of A0-->A1
c. Clear register A5
Chapter 2, Slide 120 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
5. Coding Problems (cont’d)
d. A2 = A02 + A1
e. If (B1 0) then B2 = B5 * B6
f. A2 = A0 * A1 + 10
Chapter 2, Slide 121 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
5. Coding Problems (cont’d)
h. Load A7 with contents of mem1 and post-
increment the selected pointer.
Chapter 2, Slide 122 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
Module 1 Exam (solution)
1. Functional Units
a. How many can perform an ADD? Name them.
six; .L1, .L2, .D1, .D2, .S1, .S2
b. Which support memory loads/stores?
.M .S .D .L
2. Memory Map
a. How many external ranges exist on ‘C6201?
Four
Chapter 2, Slide 123 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
3. Conditional Code
a. Which registers can be used as cond’l registers?
A1, A2, B0, B1, B2
b. Which instructions can be conditional?
All of them
4. Performance
a. What is the 'C6711 instruction cycle time?
CLKOUT1
b. How can the 'C6711 execute 1200 MIPs?
1200 MIPs = 8 instructions (units) x 150 MHz
Chapter 2, Slide 124 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
5. Coding Problems
a. Move contents of A0-->A1
MV .L1 A0, A1
or ADD .S1 A0, 0, A1
or MPY .M1 A0, 1, A1 (what’s the problem
with this?)
Chapter 2, Slide 125 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
5. Coding Problems
a. Move contents of A0-->A1
MV .L1 A0, A1
or ADD .S1 A0, 0, A1
(A0 can only be a
or MPY .M1 A0, 1, A1
16-bit value)
b. Move contents of CSR-->A1
MVC CSR, A1
c. Clear register A5
ZERO .S1 A5
or SUB .L1 A5, A5, A5
or MPY .M1 A5, 0, A5
or CLR .S1 A5, 0, 31, A5
or MVK .S1 0, A5
or XOR .L1 A5,A5,A5
Chapter 2, Slide 126 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
5. Coding Problems (cont’d)
d. A2 = A02 + A1
MPY.M1 A0, A0, A2
ADD.L1 A2, A1, A2
e. If (B1 0) then B2 = B5 * B6
[B1] MPY.M2 B5, B6, B2
f. A2 = A0 * A1 + 10
MPY A0, A1, A2
ADD 10, A2, A2
Chapter 2, Slide 128 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
Architecture
Links:
C6711 data sheet: tms320c6711.pdf
C6713 data sheet: tms320c6713.pdf
C6416 data sheet: tms320c6416.pdf
User guide: spru189f.pdf
Errata: sprz173c.pdf
Chapter 2, Slide 129 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004
Chapter 2
TMS320C6000 Architectural
Overview
- End -