
Tulsiramji Gaikwad-Patil College of Engineering & Technology, Nagpur

Department of Electronics & Communication Engineering

Solution (Win-2017)
Subject: DSP Processor & Architecture
Sem: VIIth

Que 1(a). Explain why a MAC operation is implemented in hardware in programmable DSPs.

Ans: The most common operation in DSP applications is array multiplication, i.e. the sum-of-products
(SOP) operation. For example, convolution and correlation both require array multiplication. An
important requirement is that the signal be processed in real time: the array multiplication must be
completed before the next sample of the input signal arrives. This requires both the multiplication
and the accumulation to be carried out using hardware elements. There are two approaches to solve
this problem:
1) Implement a dedicated MAC unit in H/W
2) Provide a separate multiplier and accumulator
In both approaches the MAC operation can be completed in one clock cycle. The
presence of a H/W multiplier and/or MAC unit is one of the mandatory requirements of P-DSPs.
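
The sum-of-products operation described above can be sketched in software as follows. This is only an illustrative model, not the hardware implementation: the function names mac and convolve are hypothetical, and each call to mac stands for the single multiply-accumulate step that a P-DSP would complete in one clock cycle.

```python
def mac(acc, x, h):
    """One multiply-accumulate step: returns acc + x * h."""
    return acc + x * h


def convolve(x, h):
    """Linear convolution y[n] = sum over k of h[k] * x[n - k], built from MAC steps."""
    y = []
    for n in range(len(x) + len(h) - 1):
        acc = 0  # the accumulator is cleared before each output sample
        for k in range(len(h)):
            if 0 <= n - k < len(x):
                acc = mac(acc, x[n - k], h[k])
        y.append(acc)
    return y


print(convolve([1, 2, 3], [1, 1]))
```

Every output sample needs up to len(h) MAC steps, all of which must finish before the next input sample arrives, which is why the MAC step is made a single-cycle hardware operation.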

Fig. Implementation of convolver with single multiplier/ adder.

Que 2(a). Explain the difference between Von Neumann and Harvard architecture for the computer.
Which architecture is preferred for DSP application and why?
Ans: Conventional microprocessors use the Von Neumann architecture, in which the same memory is
used to store both the program and the data (Fig 1.). Although this architecture is simple, it takes
more processor cycles to execute a single instruction, because the same bus is used for both data
and program.

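To increase the speed of operation, separate memories are used to store program and data, and each
memory is given its own set of data and address buses. This arrangement is called the Harvard
architecture. Because instruction fetches and data accesses can then proceed in parallel, the Harvard
architecture is preferred for DSP applications.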

Although using separate memories for data and instructions speeds up processing, it does not
completely solve the problem. Many DSP instructions require more than one operand, so with a
single data memory the operands must be fetched one after the other, which increases the processing
delay. This problem can be overcome by using two separate data memories for storing the operands,
so that both operands can be fetched together in a single clock cycle.

Although the above architecture improves the speed of operation, it requires more hardware and
interconnections, thus increasing the cost and complexity of the system. Therefore there must be a
trade-off between cost and speed when selecting the memory architecture for a DSP.
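
The speed argument above can be illustrated with a toy access-count model. It rests on simplifying assumptions that are not stated in the answer: each bus transfers one word per cycle, one MAC needs one instruction fetch and two operand fetches, and accesses on different buses overlap completely. The names BUSES and cycles_per_mac are hypothetical.

```python
# Which memory accesses have to share a bus in each (simplified) organisation.
BUSES = {
    "von_neumann": [["instruction", "operand_a", "operand_b"]],
    "harvard": [["instruction"], ["operand_a", "operand_b"]],
    "harvard_dual_data": [["instruction"], ["operand_a"], ["operand_b"]],
}


def cycles_per_mac(architecture):
    """Cycles per MAC in the toy model: the busiest bus sets the pace."""
    return max(len(accesses) for accesses in BUSES[architecture])


for name in BUSES:
    print(name, cycles_per_mac(name))
```

Under these assumptions the shared bus needs three cycles per MAC, the Harvard arrangement two, and the Harvard arrangement with two data memories one, at the price of the extra buses listed in each entry.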

Que 4(a). List status registers bits of C5X and their functions.
Ans:-
The status registers can be stored into data memory and loaded from data memory, thereby allowing
the ’C5x status to be saved and restored for subroutines. The LST instruction writes to ST0 and ST1, and the
SST instruction reads from them, except that the ARP bits and INTM bit are not affected by the LST #0
instruction. ST0 and ST1 each have an associated 1-level-deep shadow register stack for automatic context
saving when an interrupt trap is taken. The registers are automatically restored upon a return from interrupt
(RETI) or return from interrupt with interrupt enable (RETE) instruction. Note that the INTM bit in ST0 and
the XF bit in ST1 are not saved on the stack or restored from the stack on an automatic context save. This
allows the XF pin to be toggled in an interrupt service routine while automatic context saves are still
in use.

Fig –status register 0 (ST0) bit assignment


 ARP (Auxiliary register pointer):- These bits select the auxiliary register
(AR) to be used in indirect addressing. When the ARP is loaded, the previous ARP value is copied to
the auxiliary register buffer (ARB) in ST1. The ARP can be modified by memory-reference
instructions when you use indirect addressing, and by the MAR or LST #0 instruction. When an LST
#1 instruction is executed, the ARP is loaded with the same value as the ARB.

 OV (Overflow flag bit):- This bit indicates that an arithmetic operation has overflowed in the
arithmetic logic unit (ALU). The OV bit can be modified by the LST #0 instruction.
OV = 0 Overflow did not occur in the ALU. The OV bit is cleared by a reset or a
conditional branch (BCND/BCNDD on OV/NOV).
OV = 1 Overflow occurred in the ALU. As a latched overflow signal, the OV bit remains
set until it is cleared.
 OVM (Overflow mode bit):- This bit enables/disables the accumulator overflow saturation
mode in the arithmetic logic unit (ALU). The OVM bit can be modified by the LST #0
instruction.
OVM = 0 Disabled. An overflowed result is loaded into the accumulator without modification. The
OVM bit can be cleared by the CLRC OVM instruction.
OVM = 1 Overflow saturation mode. An overflowed result is loaded into the accumulator with
either the most positive (00 7FFF FFFFh) or the most negative value (FF 8000 0000h). The OVM bit can be
set by the SETC OVM instruction.
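
The effect of the OVM bit can be sketched as follows. This is a simplified model under stated assumptions: the accumulator is treated as a plain 32-bit two's-complement value, and the helper names wrap32 and alu_add are hypothetical. The second return value plays the role of the latched OV flag.

```python
ACC_MAX = 0x7FFFFFFF    # most positive 32-bit value
ACC_MIN = -0x80000000   # most negative 32-bit value


def wrap32(value):
    """Keep the low 32 bits and reinterpret them as a signed number."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


def alu_add(acc, operand, ovm):
    """Return (new accumulator value, overflow occurred) for acc + operand."""
    exact = acc + operand
    if ACC_MIN <= exact <= ACC_MAX:
        return exact, False
    if ovm:
        # OVM = 1: saturate to the most positive or most negative value
        return (ACC_MAX if exact > 0 else ACC_MIN), True
    # OVM = 0: the overflowed result is loaded without modification
    return wrap32(exact), True


print(alu_add(0x7FFFFFFF, 1, ovm=True))
print(alu_add(0x7FFFFFFF, 1, ovm=False))
```

With OVM = 1 the result stays pinned at the largest value, while with OVM = 0 the same addition wraps around to the most negative value, which is usually far more damaging to a signal than clipping.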

 INTM (Interrupt mode bit):- This bit globally masks or enables all interrupts. The INTM bit has
no effect on the nonmaskable RS and NMI interrupts. Note that the INTM bit is unaffected by the TRAP and
LST #0 instructions. The INTM bit is not saved on the stack or restored from the stack on an automatic
context save during interrupt service routines.

INTM = 0 All unmasked interrupts are enabled. The INTM bit can be cleared by the CLRC INTM or
RETE instruction.

INTM = 1 All maskable interrupts are disabled. The INTM bit can be set by the SETC INTM or INTR
instruction, a RS and IACK signal, or when a maskable interrupt trap is taken.

 DP (Data memory page pointer bits):- These bits specify the address of the current data memory
page. The DP bits are concatenated with the 7 LSBs of an instruction word to form a direct memory
address of 16 bits. The DP bits can be modified by the LST #0 or LDP instruction.
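
The concatenation described for the DP bits can be sketched as follows. Since the 7 LSBs come from the instruction word and the result is 16 bits wide, the DP field is taken to be 9 bits; the function name direct_address is hypothetical.

```python
def direct_address(dp, offset7):
    """Concatenate the 9-bit data page pointer with a 7-bit instruction offset."""
    if not 0 <= dp < 512:
        raise ValueError("DP must fit in 9 bits")
    if not 0 <= offset7 < 128:
        raise ValueError("offset must fit in 7 bits")
    return (dp << 7) | offset7


print(hex(direct_address(4, 0x10)))
```

Each value of DP therefore selects one 128-word page, and the instruction only has to carry the position of the operand inside that page.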

Fig –status register 1 (ST1) bit assignment

 ARB (Auxiliary register buffer):- This 3-bit field holds the previous value contained in the
Auxiliary register pointer (ARP) in ST0. Whenever the ARP is loaded, the previous ARP value is copied to
the ARB, except when using the LST #0 instruction. When the ARB is loaded using the LST #1 instruction,
the same value is also copied to the ARP. This is useful when restoring context (when not using the
automatic context save) in a subroutine that modifies the current ARP.

 CNF (On-chip RAM configuration control bit):- This 1-bit field enables the on-chip dual-access
RAM block 0 (DARAM B0) to be addressable in data memory space or program memory space. The CNF
bit can be modified by the LST #1 instruction.
CNF = 0 The on-chip DARAM block 0 is mapped into data memory space. The CNF bit can be cleared by a
reset or the CLRC CNF instruction.
CNF = 1 The on-chip DARAM block 0 is mapped into program memory space. The CNF bit can be set by
the SETC CNF instruction.
 TC (Test/control flag bit):- This 1-bit flag stores the results of the arithmetic logic unit (ALU)
or parallel logic unit (PLU) test bit operations. The TC bit is affected by the APL, BIT, BITT, CMPR, CPL,
NORM, OPL, and XPL instructions. The status of the TC bit determines whether the conditional branch, call,
and return instructions execute. The TC bit can be modified by the LST #1 instruction.
TC = 0 The TC bit can be cleared by the CLRC TC instruction or any one of the following events:
 The result of the logical operation is 1 when tested by the APL, OPL, or XPL instructions.
 A bit tested by the BIT or BITT instruction is equal to 0.
 A compare condition is false when tested by the CMPR or CPL instruction.
 The result of the exclusive-OR operation is false when tested by the NORM instruction.

TC = 1 The TC bit can be set by the SETC TC instruction or any one of the following events:
 The result of the logical operation is 0 when tested by the APL, OPL, or XPL instructions.
 A bit tested by the BIT or BITT instruction is equal to 1.
 A compare condition is true when tested by the CMPR or CPL instruction.
 The result of the exclusive-OR operation is true when tested by the NORM instruction.

 SXM (Sign-extension mode bit):-This 1-bit field enables/disables sign extension of an arithmetic
operation. The SXM bit does not affect the operations of certain arithmetic or logical instructions; the
ADDC, ADDS, SUBB, or SUBS instruction suppresses sign extension, regardless of SXM. The SXM bit
can be modified by the LST #1 instruction.
SXM = 0 Sign extension is suppressed. The SXM bit can be cleared by the CLRC SXM instruction.
SXM = 1 Sign extension is produced on data as the data is passed into the accumulator through the scaling
shifter. The SXM bit can be set by a reset or the SETC SXM instruction.
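
The difference between the two SXM settings can be sketched as follows. This is a minimal model that assumes a 16-bit data word loaded into a 32-bit accumulator with no shift applied; the function name load_16_to_acc is hypothetical.

```python
def load_16_to_acc(word16, sxm):
    """Return the 32-bit accumulator pattern after loading a 16-bit data word."""
    word16 &= 0xFFFF
    if sxm and word16 & 0x8000:
        # SXM = 1: copy the sign bit into the upper 16 bits
        return word16 | 0xFFFF0000
    # SXM = 0, or a positive value: the upper 16 bits are zero
    return word16


print(hex(load_16_to_acc(0x8001, sxm=True)))
print(hex(load_16_to_acc(0x8001, sxm=False)))
```

With SXM = 1 the word 8001h is treated as a negative number and keeps its value in 32 bits; with SXM = 0 the same bit pattern is treated as an unsigned quantity.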

 C (Carry bit):- This 1-bit field indicates an arithmetic operation carry or borrow in the arithmetic
logic unit (ALU). The single-bit shift and rotate instructions affect the C bit. The C bit can be
modified by the LST #1 instruction.
C = 0 The result of a subtraction generates a borrow or the result of an addition (except ADD with a
16-bit shift instruction) did not generate a carry. The ADD with a 16-bit shift instruction can only set
the bit (by a carry operation); otherwise, the bit is unaffected. The C bit can be cleared by the CLRC
C instruction.
C = 1 The result of an addition generates a carry or the result of a subtraction (except SUB with a
16-bit shift instruction) did not generate a borrow. The SUB with a 16-bit shift instruction can only
clear the bit (by a borrow operation); otherwise, the bit is unaffected. The C bit can be set by a reset
or the SETC C instruction.

 HM (Hold mode bit):- This 1-bit field determines whether the central processing unit (CPU) stops
or continues execution when acknowledging an active HOLD signal. The HM bit can be modified by
the LST #1 instruction.
HM = 0 The CPU continues execution from on-chip program memory but puts its external interface
in the high-impedance state. The HM bit can be cleared by the CLRC HM instruction.
HM = 1 The CPU halts internal execution. The HM bit can be set by a reset or the SETC HM
instruction.
 XF (XF pin status bit):- This 1-bit field determines the level of the external flag (XF) output pin.
The XF bit can be modified by the LST #1 instruction. The XF bit is not saved or restored from the
stack on an automatic context save during interrupt service routines.
XF = 0 The XF output pin is set to a logic low. The XF bit can be cleared by the CLRC XF
instruction.
XF = 1 The XF output pin is set to a logic high. The XF bit can be set by a reset or the SETC XF
instruction.

 PM (Product shift mode bits):- This 2-bit field determines the product shifter (P-SCALER) mode
and shift value for the product register (PREG) output into the arithmetic logic unit (ALU). The PM
bits can be set by the SPM or LST #1 instruction. See Table for the product shifter modes. The PM
shifts also occur when the PREG contents are stored to data memory. The PREG contents remain
unchanged during the shifts.

Que.5(b). Explain pipelining operation with reference to the C5X for instructions with single word and
two words.
Ans:-
a) Pipeline Operation of 1-Word Instruction
ADD *+
SAMM TREG0
MPY *+
SQRA *+, AR2

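Assume memory locations 60h = 10h, 61h = 3h, and 62h = 6h. As the cycle descriptions below imply, AR6 is
the current auxiliary register and initially points to 60h, and ACC initially holds 20h. The following is the
condition of the pipeline for each cycle.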
Cycle 1: F) Fetch the ADD instruction and update PC to next instruction.
Cycle 2: F) Fetch the SAMM instruction and update PC.
D) Decode the ADD instruction, generate address, and update AR6.
Cycle 3: F) Fetch the MPY instruction and update PC.
D) Decode the SAMM instruction, no address generate, and no ARAU update.
R) Read data from memory location 60h (10h) which is the location pointed at by AR6 before the
update of cycle 2.
Cycle 4: F) Fetch the SQRA instruction and update PC.
D) Decode the MPY instruction and update AR6.
R) No operand read for the SAMM instruction.
E) Add data read in cycle 3 (10h) to data in ACC (20h) and store result in ACC (ACC = 30h).
Cycle 5: F) Fetch the next instruction and update PC.
D) Decode the SQRA instruction, and update AR6 and ARP.
R) Read data from data memory location 61h (3h) which is the location pointed at by AR6 before
the update of cycle 4.
E) Store data in ACC to TREG0 (TREG0 = 30h).

Cycle 6: F) Fetch the next instruction and update PC.
D) Decode the instruction fetched in cycle 5.
R) Read data from data memory location 62h (6h) which is the location pointed at by AR6 before
the update of cycle 5.
E) Multiply data in TREG0 (30h) with data read in cycle 5 (3h) and store result in PREG (PREG = 90h).
Cycle 7: F) Fetch the next instruction and update PC.
D) Decode the instruction fetched in cycle 6.
R) Depends on the instruction fetched in cycle 5.

E) Add data in ACC (30h) to data in PREG (90h) and store result in ACC (ACC = C0h). Store data
read in cycle 6 (6h) to TREG0. Square data in TREG0 (6h) and store result in PREG (PREG = 24h).
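
The cycle-by-cycle description above follows a regular pattern that can be sketched in a few lines. The sketch assumes an ideal four-stage pipeline (fetch, decode, read, execute) with one-word instructions and no stalls; the function name pipeline_table is hypothetical.

```python
STAGES = ["F", "D", "R", "E"]  # fetch, decode, read, execute


def pipeline_table(instructions, cycles):
    """Return, for each cycle, which instruction occupies which stage."""
    table = []
    for cycle in range(1, cycles + 1):
        row = {}
        for depth, stage in enumerate(STAGES):
            index = cycle - 1 - depth  # instruction that entered 'depth' cycles ago
            if 0 <= index < len(instructions):
                row[stage] = instructions[index]
        table.append(row)
    return table


program = ["ADD", "SAMM", "MPY", "SQRA"]
for number, row in enumerate(pipeline_table(program, 7), start=1):
    print(number, row)

# Arithmetic used in the walkthrough, in hexadecimal
print(hex(0x20 + 0x10), hex(0x30 * 0x3), hex(0x30 + 0x90), hex(0x6 * 0x6))
```

Each instruction spends one cycle in each stage, so ADD executes in cycle 4 and SQRA in cycle 7, matching the listing above. Only the four listed instructions are tracked; the "next instruction" fetches in cycles 5 to 7 are left out.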

Que 6(a). Describe the Block diagram of DSP starter Kit (DSK).

Ans:-

Fig.-a) C5X DSK Block Diagram

Above figure shows the interconnections, which include the host interface, analog interface, and
emulation interface. PC communications are via the RS-232 port on the DSK board. The 32K bytes of
PROM contain the kernel program for boot loading. All pins of the ’C50 are connected to the external I/O
interfaces. The external I/O interfaces include four 24-pin headers, a 4-pin header, and a 14-pin XDS510
header. The TLC32040 AIC interfaces to the ’C50 serial port. Two RCA connectors provide analog input
and output on the board.
Features of the TMS320C5x DSK
 Industry standard ’C50 fixed-point DSP
 50 ns instruction cycle time
 32K-byte PROM (programmable read-only memory)
 Voice quality analog data acquisition via the TLC32040 AIC (analog interface circuit)
 Standard RCA connectors for analog input and output that provide direct connection to microphone
and speaker
 XDS510 emulator connector
 I/O expansion bus for external design

What are the addresses of the program memory address space and data memory space in the on-chip
memory of C50 in the DSP starter kit where user programs and data may be stored?
Ans:-
Fig. b) Memory map for C5X DSK

The ’C5x DSK is one of the simplest ’C5x DSP application boards. Even though no external
memory is available on the board, the 10K on-chip RAM of the ’C50 provides enough memory for most
DSP application programs. The kernel program is contained in the 32K, 8-bit PROM. The PROM is only for
DSK boot loading and cannot be accessed after boot loading, as this portion of the on-chip memory is
reserved for the kernel program.
The on-chip, dual-access, random-access-memory (DARAM) B2 is reserved as a buffer for the status
registers. The single-access, random-access-memory (SARAM) is configured as program and data memory.
The kernel program is stored in this area from 0x840h–0x980h. If the kernel program performs an overwrite,
a reset signal is required to let the DSK reload the kernel program. Since the kernel program is stored in the
SARAM, this on-chip memory cannot be configured as data memory only (RAM = 0). The interrupt vectors
are allocated, starting from 0x800h. The IPTR in the PMST register should not be modified.

Que 8. Write short notes on on-chip memory of TMS320C54X DSP’s.


Ans:
The ’54x memory is organized into three individually selectable spaces: program, data, and I/O.
All ’54x devices contain both random-access memory (RAM) and read-only memory (ROM).
Among the devices, two types of RAM are represented: dual-access RAM (DARAM) and single-access
RAM (SARAM). The DARAM and SARAM may be configured either as data memory or as program/data
memory. The table below shows how much ROM, DARAM, and SARAM is available on the different ’54x
devices. The ’54x devices also have 26 CPU registers plus peripheral registers that are mapped in data
memory space.

Memory type     ’541    ’542,’543    ’545,’546    ’548    ’549    ’5402    ’5410    ’5420
ROM             28k     2k           48k          2k      16k     4k       16k      0
Program         20k     2k           32k          2k      16k     4k       16k      0
Program/data    8k      0            16k          0       16k     4k       0        0
DARAM           5k      10k          6k           8k      8k      16k      8k       32k
SARAM           0       0            0            24k     24k     0        56k      168k

 On-chip ROM
On-chip ROM is part of the program memory space and, in some cases, part of the data memory space. The
amount of on-chip ROM available on each device varies, as shown in the table above. On devices with a
small amount of ROM (2K words), the ROM contains the boot loader, which is useful for booting to faster
on-chip or external RAM. On devices with larger amounts of ROM, a portion of the ROM may be mapped
into both data and program memory space.

 On-chip Dual access RAM (DARAM)


The DARAM is composed of several blocks. Because each DARAM block can be accessed twice per
machine cycle, the CPU can read from and write to a single block of DARAM in the same cycle. The
DARAM is always mapped in data space and is primarily intended to store data values. It can also be
mapped into program space and used to store program code.

 On-chip Single access RAM (SARAM)


The SARAM is composed of several blocks. Each block is accessible once per machine cycle for either
read or write. The SARAM is always mapped in data space and is primarily intended to store data
values. It can also be mapped into program space and used to store program code.

 On-chip memory security


The ’54x maskable memory security option protects the contents of on-chip memories. When this
option is chosen, no externally originating instruction can access the on-chip memory spaces.

 Memory mapped registers


The data memory space contains memory-mapped registers for the CPU and the on-chip peripherals.
These registers are located on data page 0, simplifying access to them. The memory-mapped access
provides a convenient way to save and restore the registers for context switches and to transfer
information between the accumulator and the other registers.

Que. 7(b). Explain the bus structure of TMS320C54X Processor.


Ans:-
The ’54x device architecture is built around eight major 16-bit buses:
 One program-read bus (PB) which carries the instruction code and immediate operands from
program memory
 Two data-read buses (CB, DB) and one data-write bus (EB), which interconnect to various elements,
such as the CPU, data-address generation logic (DAGEN), program-address generation logic
(PAGEN), on-chip peripherals, and data memory
 The CB and DB carry the operands read from data memory.
 The EB carries the data to be written to memory.
 Four address buses (PAB, CAB, DAB, and EAB), which carry the addresses needed for instruction
execution
 The ’54x devices can generate up to two data-memory addresses per cycle, using the two
auxiliary register arithmetic units (ARAU0 and ARAU1).
 The PB can carry data operands stored in program space (for instance, a coefficient table) to the
multiplier for multiply /accumulate operations or to a destination in data space for the data-move
instruction.
 The ’54x devices also have an on-chip bidirectional bus for accessing on-chip peripherals; this bus is
connected to DB and EB through the bus exchanger in the CPU interface.

Que 8(a). Explain phases of pipelining of TMS320C54X processor.

Que 8(b). What do you mean by addressing modes? Explain it with suitable example for
TMS320C54X Processor.

OR
Explain any four addressing modes of TMS320C54X processor.
Ans:-
The addressing mode refers to a method of specifying the operand or the data to be operated by the
instruction.
The TMS320C54x DSP offers seven basic addressing modes:
 Immediate addressing uses the instruction to encode a fixed value.
 Absolute addressing uses the instruction to encode a fixed address.
 Accumulator addressing uses an accumulator to access a location in program memory as data.
 Direct addressing uses seven bits of the instruction to encode an offset relative to DP or to SP. The
offset plus DP or SP determine the actual address in data memory.
 Indirect addressing uses the auxiliary registers to access memory.
 Memory-mapped register addressing modifies the memory-mapped registers without affecting
either the current DP value or the current SP value.
 Stack addressing manages adding and removing items from the system stack.

 Immediate addressing mode


1. In the immediate addressing mode, the data is specified as part of the instruction.
2. The instruction can carry a 3-, 5-, 8-, 9-, or 16-bit constant.
3. The immediate constant is specified by the # symbol.
Example: LD #12Ah, DP : Load the immediate 9-bit constant (12Ah) into the DP field of status register 0.

 Absolute addressing mode


1. In absolute addressing, the 16-bit address of the operand is specified directly in the instruction.
2. This addressing can be used to address an operand in all three address spaces of the processor.
3. The 16-bit address is specified as a 16-bit constant without the # symbol.
Example: MVKD 5F38h, *AR2 : Move the data from the data memory location at the absolute address
5F38h to the data memory location addressed by auxiliary register AR2.
 Accumulator addressing mode
In accumulator addressing, the content of the accumulator is the address of the operand/data in
program memory.
Example: READA *AR3 : Read the content of the program memory location addressed by accumulator A
and store it in the data memory location addressed by AR3.
 Direct addressing mode

1. In the direct addressing mode, the lower 7 bits of the data memory address are specified in the
instruction itself. The 16-bit data memory address is formed using either the 9 bits of DP (data page
pointer) in status register ST0 or the 16-bit SP (stack pointer).
2. When DP is used, the 9 bits of DP form the upper 9 bits of the 16-bit address, and the lower 7 bits
are the offset specified in the instruction.
3. When SP is used, the 16-bit content of SP is added to the 7 bits specified in the instruction to form
the 16-bit address.
Example: ADD 6h, A : Add the content of the memory location directly addressed by the instruction
(offset 6h) to accumulator A.
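
The two ways of forming a direct address can be contrasted in a short sketch. The function name c54x_direct_address and its use_sp flag are hypothetical stand-ins for the processor's choice between DP-relative and SP-relative addressing; the arithmetic follows points 2 and 3 above.

```python
def c54x_direct_address(offset7, dp=0, sp=0, use_sp=False):
    """Form a 16-bit data address from a 7-bit instruction offset."""
    offset7 &= 0x7F
    if use_sp:
        # SP-relative: the offset is added to the 16-bit stack pointer
        return (sp + offset7) & 0xFFFF
    # DP-relative: the 9-bit DP supplies the upper bits, the offset the lower 7
    return ((dp & 0x1FF) << 7) | offset7


print(hex(c54x_direct_address(0x06, dp=0x12A)))
print(hex(c54x_direct_address(0x06, sp=0x3FF0, use_sp=True)))
```

With DP = 12Ah the offset 6h selects address 9506h, whereas with SP = 3FF0h the same offset selects 3FF6h. The DP form picks a word inside a fixed 128-word page, while the SP form picks a word relative to the current top of the stack.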

 Indirect addressing mode

In the indirect addressing mode, the data memory address is specified by the content of one of
the eight auxiliary registers, AR0-AR7. The AR currently used for accessing the data is denoted by the
3-bit ARP. The content of the AR can be updated automatically either before or after the operand is fetched.

 Memory-mapped register addressing

1. Memory-mapped register addressing is used to modify the memory-mapped registers without
affecting either the current data-page pointer (DP) value or the current stack-pointer (SP) value.
Because DP and SP do not need to be modified in this mode, the overhead for writing to a
register is minimal.
2. Memory-mapped register addressing works for both direct and indirect addressing. In this mode,
when direct addressing is used, the address is generated by forcing the nine most significant
bits (MSBs) of the data-memory address to 0, regardless of the current value of DP or SP. When
indirect addressing is used, the address is generated from the seven LSBs of the current
auxiliary register value.
3. In addition to registers, any scratch-pad RAM located on data page 0 can be modified by using
memory-mapped register addressing. Only eight instructions can use memory-mapped register
addressing:
LDM MMR, dst
MVDM dmad, MMR
MVMD MMR, dmad
MVMM MMRx, MMRy
POPM MMR
PSHM MMR
STLM src, MMR
STM #lk, MMR
 Stack addressing mode

1. The system stack is used to automatically store the program counter during interrupts and
subroutine calls.
2. The stack is filled from the highest to the lowest memory address. The processor uses a 16-bit
memory-mapped register, the stack pointer (SP), to address the stack.
3. SP always points to the last element stored onto the stack. Four instructions access the stack using
the stack addressing mode:
 PSHD pushes a data-memory value onto the stack.
 PSHM pushes a memory-mapped register onto the stack.
 POPD pops a data-memory value from the stack.
 POPM pops a memory-mapped register from the stack.
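
The stack behaviour described above (filled from high to low addresses, SP pointing to the last element stored) can be modelled as follows. The class name Stack and the flat 64K-word memory list are assumptions made for the sketch, not part of the processor description.

```python
class Stack:
    """Toy model of a stack that grows toward lower addresses."""

    def __init__(self, top_address, size=0x10000):
        self.memory = [0] * size
        self.sp = top_address  # SP points to the last element stored

    def push(self, value):
        # Move SP down first, then store, so SP always marks the newest item
        self.sp = (self.sp - 1) & 0xFFFF
        self.memory[self.sp] = value & 0xFFFF

    def pop(self):
        # Read the newest item, then move SP back up
        value = self.memory[self.sp]
        self.sp = (self.sp + 1) & 0xFFFF
        return value


stack = Stack(top_address=0x0100)
stack.push(0x1111)
stack.push(0x2222)
print(hex(stack.sp), hex(stack.pop()), hex(stack.pop()), hex(stack.sp))
```

A push decrements SP before writing and a pop increments it after reading, which is what keeps SP on the last element stored.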

Que 9(b). Write short notes on Code composer studio (CCS).

Ans:

The DSP Processor is supported by a set of software development tools, which includes an
optimizing C/C++ compiler, an assembler, a linker, and assorted utilities as shown below.

The shaded portion of the figure highlights the most common path of software development for
C/C++ language programs. The other portions are peripheral functions that enhance the development
process. The functions of different sections are described below.

 The C/C++ compiler accepts the C/C++ source code and produces C54x assembly language source
code. An optimizer and an interlist feature are parts of the compiler.
 The optimizer modifies code to improve the efficiency of C/C++ programs.
 The interlist feature interweaves C/C++ source statements with assembly language
output.
 The assembler translates assembly language source files into machine language
object files. The machine language is based on common object file format (COFF).
 The linker combines object files into a single executable object module. As it creates
the executable module, it performs relocation and resolves external references. The
linker accepts relocatable COFF object files and object libraries as input.
 The archiver collects a group of files into a single archive file, called a library.
The archiver also allows a library to be modified by deleting, replacing, extracting, or
adding members.
 The mnemonic-to-algebraic translator utility converts assembly language source
files. The utility accepts an assembly language source file containing mnemonic
instructions and then it converts the mnemonic instructions into algebraic instructions,
producing an assembly language source file containing algebraic instructions.
 The runtime-support libraries contain the ISO standard runtime-support functions,
compiler-utility functions, floating-point arithmetic functions, and C I/O functions,
supported by the C54x compiler.
 The C54x debugger accepts executable COFF files as input.
 The hex conversion utility converts a COFF object file into TI-Tagged, ASCII-hex, Intel,
or Tektronix object format. The converted file can be downloaded into an EPROM programmer.
 The absolute lister accepts linked object files as input and creates .abs files as output.
These .abs files can be assembled to produce a listing that contains absolute, rather than
relative, addresses. Without the absolute lister, producing such a listing would be tedious
and would require many manual operations.
 The cross-reference lister uses object files to produce a cross-reference listing
showing symbols, their definitions, and their references in the linked source files.

Que 9(a). List out the salient features of TMS320C6X Processor.

Ans:
 The C6000 devices execute up to eight 32‐bit instructions per cycle.
 The C6x CPU consists of 32 general‐purpose 32‐bit registers and eight functional units.
 These eight functional units contain:
 Two multipliers
 Six ALUs
 The C6000 generation has
 A complete set of optimized development tools,
 An efficient C compiler, an assembly optimizer for simplified assembly‐language
programming and scheduling
 Windows based debugger interface for visibility into source code execution
characteristics.
 Advanced VLIW CPU with eight functional units, including two multipliers and six arithmetic units
 Executes up to eight instructions per cycle
 Allows designers to develop highly effective RISC‐like code for fast development time

 Instruction packing
 Gives code size equivalence for eight instructions executed serially or in parallel
 Reduces code size, program fetches, and power consumption

 Conditional execution of all instructions


 Reduces costly branching
 Increases parallelism for higher sustained performance

 Efficient code execution on independent functional units


 efficient C compiler
 assembly optimizer for fast development and improved parallelization

 8/16/32‐bit data support, providing efficient memory support for a variety of applications.
 Hardware support for single‐precision (32‐bit) and double‐precision (64‐bit) IEEE floating point
operations.
 32 × 32‐bit integer multiply with 32‐bit or 64‐bit result.

Que 10(b). Give the brief introduction of Motorola DSP563XX Processor.


Ans:

The combination of powerful instruction set, multiple internal buses, DMA channels, on-chip
program and data memories, external buses, standard peripherals, and power management of the DSP56300
family make it an excellent solution for wireless or wireline DSP applications from individual subscriber to
infrastructure, as well as multimedia and high-end audio applications, including videoconferencing.
Core Overview
 One Million Instructions Per Second (MIPS) per MHz of operating speed
 Object code compatible with the DSP56000 core
 Highly parallel instruction set
 Data Arithmetic Logic Unit (Data ALU)
 Address Generation Unit (AGU)
 Program Control Unit (PCU)
 On-chip instruction cache controller
 External memory interface (Port A)
 Phase Locked Loop (PLL)
 Hardware debugging support (JTAG TAP, OnCETM module, and Address Trace mode)
 Six-channel Direct Memory Access (DMA) controller
 Reduced power dissipation
o Very low power CMOS design
o Wait and Stop low-power standby modes
o Fully-static logic
 Data Arithmetic Logic Unit (Data ALU)
The Data ALU performs all the arithmetic and logical operations on data operands in the
DSP56300 core. The components of the Data ALU are as follows:
 Fully pipelined 24 × 24-bit parallel Multiplier-Accumulator (MAC) unit
 Bit Field Unit, comprising a 56-bit parallel barrel shifter (fast shift and
normalization; bit stream generation and parsing)
 Conditional ALU instructions
 24-bit or 16-bit arithmetic support under software control
 Four 24-bit input general purpose registers: X1, X0, Y1, and Y0
 Six Data ALU registers (A2, A1, A0, B2, B1, and B0) that are concatenated into
two general purpose 56-bit accumulators and accumulator shifters (A and B)
 Two data bus shifter/limiter circuits

 Address Generation Unit (AGU)


The Address Generation Unit (AGU) performs the effective address calculations for
addressing data operands in memory and contains the integer arithmetic and registers used
to generate the addresses. The AGU operates in parallel with the other core resources, and
so minimizes the address-generation overhead of instruction sequences. It implements four
types of address arithmetic:
 Linear
 Modulo
 Multiple wrap-around modulo
 Reverse-carry
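
Three of the address-arithmetic types listed above can be illustrated with a simplified software model. The function names are hypothetical, the modulo case ignores any alignment rules the hardware places on buffer base addresses, and the multiple wrap-around variant is omitted. Reverse-carry is modelled as an addition whose carry propagates from the most significant bit toward the least significant bit.

```python
def linear(addr, step):
    """Linear arithmetic: plain addition."""
    return addr + step


def modulo(addr, step, base, length):
    """Modulo arithmetic: the address wraps inside a circular buffer."""
    return base + (addr - base + step) % length


def bit_reverse(value, bits):
    """Mirror the lowest 'bits' bits of value."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


def reverse_carry(index, step, bits):
    """Add step to index with the carry running from MSB to LSB."""
    mask = (1 << bits) - 1
    total = (bit_reverse(index, bits) + bit_reverse(step, bits)) & mask
    return bit_reverse(total, bits)


index, order = 0, []
for _ in range(8):
    order.append(index)
    index = reverse_carry(index, 4, 3)
print(order)
print(hex(modulo(0x103, 1, base=0x100, length=4)))
```

Stepping an 8-point buffer by half its length with reverse-carry arithmetic visits the samples in bit-reversed order, which is the ordering needed by radix-2 FFT routines, while modulo arithmetic gives the wrap-around needed for circular delay lines.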
 Program Control Unit (PCU)
The Program Control Unit (PCU) performs instruction fetch, instruction decoding,
hardware DO loop control, and exception processing. The PCU implements a seven-stage
pipeline and controls the different processing states of the DSP56300 core. The PCU consists
of three hardware blocks:
 Program Decode Controller (PDC): Decodes the 24-bit instruction loaded into
the instruction latch and generates all necessary pipeline control signals
 Program Address Generator (PAG): Contains the hardware for program address
generation, system stack, and loop control
 Program Interrupt Controller (PIC): Arbitrates among all interrupt requests
(internal interrupts and the five external requests IRQA, IRQB, IRQC, IRQD, and
NMI), and generates the appropriate interrupt vector address
 Position independent code (PIC) support
 Addressing modes optimized for DSP applications (including immediate
offsets)

 On-chip instruction cache controller
 On-chip memory-expandable hardware stack
 Nested hardware DO loops
 Fast auto-return interrupts
 Program Address Trace mode support

 On-chip Instruction Cache


The instruction cache functions as a buffer memory between external memory and the DSP
core processor. When code executes, the code words at the locations requested by the instruction set are
copied into the instruction cache for direct access by the core processor. If the same code is used frequently
in a set of program instructions, storing these instructions in the cache increases throughput,
because the corresponding external bus accesses are eliminated. The instruction cache has 1024 24-bit
words (1 K words) of instruction cache memory, with the following features:
 Software controlled Cache Enable (CE) bit in the Extended Mode Register (EMR)
in the Status Register (SR)
 Instruction cache size of 1024 24-bit words
 Eight-way, fully associative instruction cache with sectored placement policy
 1- to 4-word transfer granularity
 Least recently used (LRU) sector replacement algorithm
 Transparent operation (that is, no user management is required)
 Individual sector locking/unlocking
 Global cache flush controlled by software
 Cache controller status observable via the JTAG/OnCE port
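
The sector replacement, locking, and flush features listed above can be illustrated with a toy model. This is a minimal sketch in Python, assuming eight sectors of 128 words each (1024 words divided by 8); the class name `SectorCache` and its methods are hypothetical, and the model works at whole-sector granularity, ignoring the 1- to 4-word transfers and per-word valid state of the real controller.

```python
from collections import OrderedDict


class SectorCache:
    """Toy model of a fully associative, sectored cache with LRU replacement."""

    def __init__(self, num_sectors=8, sector_words=128):
        self.num_sectors = num_sectors
        self.sector_words = sector_words
        # Maps sector tag -> locked flag; least recently used entry first.
        self.sectors = OrderedDict()

    def access(self, address):
        """Return True on a sector hit, False on a sector miss."""
        tag = address // self.sector_words
        if tag in self.sectors:
            self.sectors.move_to_end(tag)  # mark as most recently used
            return True
        if len(self.sectors) >= self.num_sectors:
            # Evict the least recently used sector that is not locked.
            victim = next(
                (t for t, locked in self.sectors.items() if not locked), None
            )
            if victim is None:
                return False  # all sectors locked: nothing is replaced
            del self.sectors[victim]
        self.sectors[tag] = False
        return False

    def lock(self, address):
        """Lock the sector holding this address so it is never evicted."""
        tag = address // self.sector_words
        if tag in self.sectors:
            self.sectors[tag] = True

    def flush(self):
        """Global flush: invalidate every sector."""
        self.sectors.clear()
```

Locking the sector that holds a time-critical loop keeps it resident while other code cycles through the remaining sectors.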
 Phase Locked Loop (PLL) and Clock Generator
The clock generator in the DSP56300 core is composed of two main blocks:
 Phase Locked Loop (PLL): Clock-input division, frequency multiplication, and skew
elimination
 Clock Generator (CLKGEN): Low-power division and clock pulse generation and
change of low-power Divide Factor (DF) without loss of lock

The PLL allows the processor to operate at a high internal clock frequency using a
low frequency clock input, a feature that offers two immediate benefits:
 A lower frequency clock input reduces the overall electromagnetic interference
generated by a system.
 The ability to oscillate at different frequencies reduces costs by eliminating the need
to add additional oscillators to a system.
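
The chain of clock-input division, frequency multiplication, and low-power division can be expressed as simple arithmetic. The sketch below is in Python; the parameter names `predivider` and `multiplier` and the example values are assumptions made for illustration, and the legal ranges of each factor must be taken from the device manual.

```python
def core_clock_hz(f_in_hz, predivider, multiplier, divide_factor=1):
    """Estimate the internal clock frequency.

    f_in_hz:       external clock input frequency
    predivider:    clock-input division applied before the PLL (assumed name)
    multiplier:    PLL frequency multiplication (assumed name)
    divide_factor: low-power Divide Factor (DF) applied after the PLL
    """
    return f_in_hz / predivider * multiplier / divide_factor
```

For example, `core_clock_hz(4_000_000, 1, 20)` gives 80 MHz from a 4 MHz input, and raising `divide_factor` to 4 lowers the result to 20 MHz without changing the PLL multiplication.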

 Direct Memory Access (DMA)


The Direct Memory Access (DMA) block permits data transfers without the interaction of
the core. It supports any combination of internal memory, internal peripheral I/O and external
memory as source and destination during accesses. The DMA block has the following features:
 Six DMA channels supporting internal and external accesses
 One-, two-, and three-dimensional transfers (including circular buffering)
 End-of-block-transfer interrupts
 Triggering from interrupt lines and all peripherals
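
A two-dimensional transfer with a circular destination buffer can be pictured as follows. This is a generic sketch in Python of the addressing pattern only; the function names and parameters are hypothetical and do not correspond to DSP56300 DMA registers.

```python
def dma_2d_addresses(base, row_count, row_length, row_stride):
    """Yield source addresses for a two-dimensional block:
    row_count rows of row_length words, with rows row_stride words apart."""
    for row in range(row_count):
        for col in range(row_length):
            yield base + row * row_stride + col


def dma_copy(src_mem, dst_mem, src_addrs, dst_base, dst_size):
    """Copy words into a circular destination buffer of dst_size words.
    The return value stands in for the end-of-block-transfer interrupt."""
    for i, addr in enumerate(src_addrs):
        dst_mem[dst_base + (i % dst_size)] = src_mem[addr]
    return True
```

In this model the core only sets up the parameters; the loops stand for work the DMA block would do on its own while the core keeps executing.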

 Introduction to Digital Signal Processing


Digital signal processing is the arithmetic processing of real-time signals that are sampled
at regular intervals and digitized. Examples of digital signal processing include the
following:
 Filtering
 Convolution (mixing two signals)
 Correlation (comparing two signals)
 Rectification, amplification, and/or transformation
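
Convolution and correlation are both sums of products, which is why they map directly onto the MAC hardware. A minimal sketch in Python, with function names chosen for illustration:

```python
def convolve(x, h):
    """Linear convolution: y[n] = sum over k of h[k] * x[n - k]."""
    y = [0] * (len(x) + len(h) - 1)
    for n, xn in enumerate(x):
        for k, hk in enumerate(h):
            y[n + k] += xn * hk  # one multiply-accumulate per term
    return y


def correlate(x, y_sig, lag):
    """Cross-correlation at one lag: sum over n of x[n] * y_sig[n + lag]."""
    return sum(
        x[n] * y_sig[n + lag]
        for n in range(len(x))
        if 0 <= n + lag < len(y_sig)
    )
```

Correlating a signal with itself gives the largest value at lag 0, which is the basis for comparing two signals.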

Que 10(b). Write the main features that contribute to the high throughput of the DSP563XX
processor.
Ans:
The main features that contribute to this high throughput include the following:
 Speed: The DSP56300 family supports most high-performance DSP applications.
 Precision: The data paths are 24 bits wide, providing 144 dB of dynamic range; intermediate results
held in the 56-bit accumulators can range over 336 dB.
 Parallelism: Each on-chip execution unit, memory, and peripheral operates independently and in
parallel with the other units through a sophisticated bus system. The Data ALU, AGU, and program
controller operate in parallel so that the following can execute in a single instruction:
 An instruction pre-fetch
 A 24-bit × 24-bit multiplication
 A 56-bit addition
 Two data moves
 Two address-pointer updates using either linear or modulo arithmetic

 Flexibility: While many other DSPs need external communications circuitry to interface with
peripheral circuits (such as A/D converters, D/A converters, or host processors), the DSP56300
family provides on-chip serial and parallel interfaces that can support various configurations of
memory and peripheral modules. The peripherals are interfaced to the DSP56300 family core
through a peripheral interface bus that provides a common interface to many different peripherals.
 Sophisticated Debugging: Motorola’s On-Chip Emulation (OnCE) technology allows simple,
inexpensive, and speed independent access to the internal registers for debugging. With the OnCE
module, you can determine easily the exact status of the registers and memory locations and what
instructions were last executed.
 Phase Locked Loop (PLL)-Based Clocking: The PLL allows the chip to use almost any available
external system clock for full-speed operation, while also supplying an output clock synchronized to
a synthesized internal core clock. It improves the synchronous timing of the external memory port,
eliminating the timing skew common on other processors.
 Invisible Pipeline: The seven-stage instruction pipeline is essentially invisible to the programmer,
allowing straightforward program development in either assembly language or high-level languages
such as C or C++.
 Instruction Set: The instruction mnemonics are similar to those used for microcontroller units,
making the transition from programming microprocessors to programming the chip as easy as
possible. New microcontroller instructions, addressing modes, and bit field instructions allow for
significant decreases in program code size. The orthogonal syntax controls the parallel execution
units. The hardware DO loop instruction and the repeat (REP) instruction make writing straight-line
code obsolete.
 Low Power: Designed in CMOS, the DSP56300 family consumes very little power. Two additional
low-power modes, Stop and Wait, further reduce power requirements. Wait is a low-power mode in
which the DSP56300 family core is shut down, but the peripherals and interrupt controller continue
to operate so that an interrupt can bring the chip out of Wait mode. In Stop mode, even more of the
circuitry is shut down for the lowest power consumption. Several different ways exist to bring the
chip out of Stop mode: hardware RESET, IRQA, and DE.
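
The dynamic range figures quoted under Precision follow from the rule of thumb of about 6 dB per bit: 6 × 24 = 144 dB and 6 × 56 = 336 dB. The short Python sketch below compares that rule with the exact expression 20 log10(2^bits); the function name is chosen for illustration.

```python
import math


def dynamic_range_db(bits):
    """Exact dynamic range of a fixed-point word: 20 * log10(2 ** bits)."""
    return 20 * math.log10(2 ** bits)


for bits in (24, 56):
    print(bits, "bits:", 6 * bits, "dB (rule of thumb),",
          round(dynamic_range_db(bits), 2), "dB (exact)")
```

The exact values are slightly larger than the rounded figures because each bit contributes about 6.02 dB.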

Que 11(a). Draw and explain the working interpolation filter.


Ans:
An interpolation filter is used to increase the sampling rate. The interpolation process involves
inserting additional samples between the incoming samples so that the output has a higher
sampling rate than the input.

One way to implement the interpolation filter is to first insert zeros between samples of original
sample sequence. The zero inserted sequence is then passed through an appropriate low pass digital FIR
filter to generate the interpolated sequence. The interpolation process is shown in fig below.
This type of interpolation is called linear interpolation because the convolution sequence h(n) is
derived based on linear interpolation of the samples.
The h(n) selected is just a second-order filter and therefore uses just two adjacent samples to
interpolate a sample.
A higher order filter can be used to base interpolation on more input samples. To implement an ideal
interpolation, the filter based on samples of an appropriate Sinc function can be used.

x(n) --> [Insert (L-1) zeros] --> [Low pass filter] --> y(m)
Sampling frequency: fs at x(n), L fs after zero insertion, L fs at y(m)

Fig. Digital interpolation with interpolation factor L
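
The two stages in the figure, zero insertion followed by low-pass FIR filtering, can be sketched in Python as follows. The triangular impulse response h(n) = 1 - |n|/L is one common choice that yields linear interpolation; the function names are chosen for illustration, and samples beyond the end of the input are assumed to be zero.

```python
def insert_zeros(x, L):
    """Place L - 1 zeros after every input sample (rate fs -> L * fs)."""
    out = []
    for sample in x:
        out.append(sample)
        out.extend([0] * (L - 1))
    return out


def linear_interp_filter(L):
    """Triangular impulse response h(n) = 1 - |n| / L for |n| < L."""
    return [1 - abs(n) / L for n in range(-(L - 1), L)]


def interpolate(x, L):
    """Zero-insert, then filter; the filter delay is removed so that
    every L-th output sample coincides with an input sample."""
    up = insert_zeros(x, L)
    h = linear_interp_filter(L)
    delay = L - 1  # index of the centre tap
    y = []
    for m in range(len(up)):
        acc = 0.0
        for k, hk in enumerate(h):
            idx = m + delay - k
            if 0 <= idx < len(up):
                acc += hk * up[idx]  # multiply-accumulate
        y.append(acc)
    return y
```

With L = 2 the filter is h = [0.5, 1, 0.5], so each inserted zero is replaced by the average of its two neighbouring input samples. Replacing `linear_interp_filter` with samples of a sinc function would base each output on more input samples.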


Que 12(b). Draw and explain the working Wavelet filter.
Ans:
Wavelet Filtering. The Wavelet Filter command allows you to selectively emphasize or de-emphasize
image details in a certain spatial frequency domain. ... A Wavelet transform is
similar to a Fast Fourier Transform (FFT), in that it breaks a signal or image down into
frequency components.
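
As an illustration of emphasizing or de-emphasizing detail, the sketch below applies a one-level Haar wavelet to a one-dimensional signal in Python. The Haar wavelet, the unnormalized average/difference form, the function names, and the restriction to even-length 1-D input are all assumptions made for the example; the answer above does not specify a particular wavelet.

```python
def haar_analysis(x):
    """One decomposition level: pairwise averages (low-frequency band)
    and pairwise half-differences (high-frequency detail band)."""
    approx = [(x[i] + x[i + 1]) / 2 for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) / 2 for i in range(0, len(x) - 1, 2)]
    return approx, detail


def haar_synthesis(approx, detail):
    """Rebuild the signal from the two bands."""
    x = []
    for a, d in zip(approx, detail):
        x.extend([a + d, a - d])
    return x


def wavelet_filter(x, detail_gain):
    """Scale the detail band, then reconstruct.
    detail_gain > 1 emphasizes fine detail, detail_gain < 1 suppresses it."""
    approx, detail = haar_analysis(x)
    return haar_synthesis(approx, [detail_gain * d for d in detail])
```

A gain of 1 returns the original signal, a gain of 0 keeps only the smoothed version, and a gain above 1 sharpens the differences between neighbouring samples.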
