
REAL TIME DIGITAL SIGNAL PROCESSING
Eng. Julian S. Bruno

Introduction
Why Digital? A brief comparison with analog.

FI-UBA 2010 Seminario de Electrónica: Sistemas Embebidos


FI-UBA 2010 Eng. Julian S. Bruno

Advantages
Flexibility. Easily modifiable and upgradeable.
Reproducibility. Does not depend on component tolerances. Exactly reproduced from one unit to another.
Reliability. No age or environmental drift.
Complexity. Allows sophisticated applications in only one chip.

The BIG picture
[Figure: Real Time DSP System. Data, real-time algorithms, results, FAST]
Sampling signals: A very important first step.
The sampling theorem indicates that a continuous signal can be properly sampled only if it does not contain frequency components above one-half of the sampling rate.
[Figure: Real Time DSP System]

Sampling low-pass signals (CT)
[Figure: sampling of a low-pass signal]
Nyquist sampling theorem: $f_S \ge 2 f_N$
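A minimal sketch of what happens when the sampling theorem is violated. It assumes a single real tone and uses the hypothetical helper name `apparent_frequency`; the spectrum of a sampled signal repeats every sampling rate, so a tone above one-half of the sampling rate folds back into the band below it.

```python
def apparent_frequency(f_signal, f_sample):
    """Frequency (Hz) at which a real tone at f_signal appears after sampling.

    Illustrative helper, not part of the original slides.
    """
    f = f_signal % f_sample        # the sampled spectrum repeats every f_sample
    return min(f, f_sample - f)    # fold into the range 0 .. f_sample / 2


if __name__ == "__main__":
    # A 7 kHz tone sampled at 10 kHz cannot be told apart from a 3 kHz tone.
    print(apparent_frequency(7000, 10000))
    print(apparent_frequency(3000, 10000))
```

Both calls return the same value, which is the frequency ambiguity that an anti-aliasing filter placed before the ADC is meant to prevent.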



Aliasing and frequency ambiguity
[Figure: aliasing and frequency ambiguity]

Sampling band-pass signals
Also known as IF sampling, harmonic sampling, sub-Nyquist sampling and undersampling.

$\frac{2 f_c - B}{m} \ge f_s \ge \frac{2 f_c + B}{m + 1}$

for any positive integer $m$, where $f_s \ge 2B$ is accomplished.
Sampling band-pass signals

m    (2Fc-B)/m    (2Fc+B)/(m+1)    Optimum Fs
1    35.0 MHz     22.5 MHz         22.5 MHz
2    17.5 MHz     15.0 MHz         17.5 MHz
3    11.66 MHz    11.25 MHz        11.25 MHz
4    8.75 MHz     9.0 MHz          -
5    7.0 MHz      7.5 MHz          -

Optimum Fs is defined here as that frequency where spectral replications do not butt up against each other except at zero Hz.
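The table can be reproduced with a short sketch. The function name `bandpass_sampling_ranges` is hypothetical, and the values Fc = 20 MHz and B = 5 MHz are an assumption inferred from the table entries (2Fc-B = 35 MHz and 2Fc+B = 45 MHz at m = 1). For each m, the code keeps the range of sampling rates between (2Fc+B)/(m+1) and (2Fc-B)/m, and stops when the range becomes empty or falls below 2B.

```python
def bandpass_sampling_ranges(fc, b):
    """List (m, fs_min, fs_max) for every m that allows band-pass sampling.

    fc is the band centre and b the bandwidth, both in the same unit.
    Illustrative helper, not part of the original slides.
    """
    ranges = []
    m = 1
    while True:
        fs_max = (2 * fc - b) / m
        fs_min = max((2 * fc + b) / (m + 1), 2 * b)   # also enforce fs >= 2B
        if fs_min > fs_max:
            break                                      # no valid rate for this m
        ranges.append((m, fs_min, fs_max))
        m += 1
    return ranges


if __name__ == "__main__":
    # Assumed example values in MHz, inferred from the table.
    for m, fs_min, fs_max in bandpass_sampling_ranges(20.0, 5.0):
        print(f"m={m}: {fs_min:.2f} MHz <= fs <= {fs_max:.2f} MHz")
```

With these assumed values only m = 1, 2 and 3 give a non-empty range, which agrees with the rows of the table that list an optimum Fs.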

Reconstruction signals
Analog signal X can be reconstructed from its samples by using the following formula:

$X(t) = \sum_{k} X(kT_S)\,\mathrm{sinc}\!\left(\frac{t - kT_S}{T_S}\right)$

The reconstruction is based on the interpolation of shifted sinc functions.
It is very difficult to generate sinc functions by electronic circuitry.
An approximation of a sinc function is a pulse. A sample and hold circuit performs this approximation.

Reconstruction Errors
Sample and hold circuits
The gain in the desired central band is not constant.
There are high-frequency replicas of the signal spectrum.
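The interpolation formula under "Reconstruction signals" can be sketched numerically. This assumes the normalized definition sinc(x) = sin(pi x)/(pi x) and a finite list of samples starting at k = 0; the names `sinc` and `reconstruct` are illustrative only.

```python
import math


def sinc(x):
    """Normalized sinc: sin(pi*x) / (pi*x), with sinc(0) = 1 (assumed definition)."""
    if x == 0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)


def reconstruct(samples, ts, t):
    """Evaluate the sum of shifted sinc functions at time t.

    samples[k] is taken as X(k * ts); samples outside the list count as zero.
    """
    return sum(x_k * sinc((t - k * ts) / ts) for k, x_k in enumerate(samples))


if __name__ == "__main__":
    samples = [0.0, 1.0, 0.5, -0.25]
    print(reconstruct(samples, 1.0, 2.0))   # at a sampling instant
    print(reconstruct(samples, 1.0, 1.5))   # between two samples
```

At a sampling instant every shifted sinc except one is zero, so the sum returns the stored sample; between instants all samples contribute, which is why the ideal reconstruction is hard to build in hardware.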
Reconstruction Solutions
The gain in the desired central band is not constant.
It is possible to compensate for this non-ideality by using an inverse filter as part of the DSP component.
There are high-frequency replicas of the signal spectrum, which can be removed by using a lowpass filter.

Reconstruction Errors Example
[Figure: reconstruction errors example]
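The non-constant gain mentioned under "Reconstruction Solutions" can be quantified with a small sketch. It assumes an ideal sample and hold whose pulse lasts one full sampling period, for which the magnitude response normalized to unity at DC is |sin(pi f/fs)/(pi f/fs)|. The function names are hypothetical, and `inverse_filter_gain` only gives the target gain of a compensating filter, not a filter design.

```python
import math


def hold_gain(f, fs):
    """Magnitude response of an ideal sample and hold, normalized to 1 at DC."""
    x = f / fs
    if x == 0:
        return 1.0
    return abs(math.sin(math.pi * x) / (math.pi * x))


def inverse_filter_gain(f, fs):
    """Gain a compensating (inverse) filter would need at frequency f (f < fs)."""
    return 1.0 / hold_gain(f, fs)


if __name__ == "__main__":
    fs = 48000.0
    for f in (0.0, 6000.0, 12000.0, 24000.0):
        print(f, hold_gain(f, fs), inverse_filter_gain(f, fs))
```

At half the sampling rate the hold gain drops to 2/pi (about 0.64), so the inverse filter has to boost the upper part of the band by roughly 1.57.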

Real time constraints
[Figure: Real Time DSP System, signal path]

Real time constraints
Algorithm time (tA) MUST fit within one sampling period (tS), that is, between two consecutive samples.
Thus tA limits the maximum frequency at which a system can work.
The definition of real time is VERY application dependent (it follows the speed of evolution of the system).
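The constraint tA <= tS can be turned into a quick budget check. This is a rough sketch that assumes the algorithm cost is known as a fixed number of processor cycles per sample and ignores interrupt and I/O overhead; the function names and the example numbers are illustrative.

```python
def max_sampling_rate(cycles_per_sample, clock_hz):
    """Highest sampling rate (Hz) for which tA still fits in one period tS."""
    t_a = cycles_per_sample / clock_hz   # algorithm time per sample, in seconds
    return 1.0 / t_a


def meets_real_time(cycles_per_sample, clock_hz, fs):
    """True when the algorithm time tA does not exceed the sampling period tS."""
    t_a = cycles_per_sample / clock_hz
    t_s = 1.0 / fs
    return t_a <= t_s


if __name__ == "__main__":
    # Assumed example: 2000 cycles per sample on a 100 MHz core.
    print(max_sampling_rate(2000, 100e6))
    print(meets_real_time(2000, 100e6, 48e3))
    print(meets_real_time(2000, 100e6, 96e3))
```

In this assumed case tA is 20 microseconds, so the system keeps up at 48 kHz but not at 96 kHz.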


Real time constraints
Block Processing Mode
4 memory buffers of length N are required for the double-buffering method.
2 memory buffers (in and out) are needed for internal processing by the processor.
A delay of 2NTs is incurred in block processing.
More complicated programming is needed to manage the switching between buffers.
The ADC and DAC can be configured to transfer data samples into the internal memory of the processor using the serial ports and the DMA.

DSP hardware
[Figure: Real Time DSP System]
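The double-buffering scheme described under "Block Processing Mode" can be simulated in a few lines. This is a sketch under simplifying assumptions: the DMA transfers are modelled as plain list copies, one loop iteration stands for one block period of N samples, and the name `simulate_double_buffering` is hypothetical. While one input/output buffer pair is owned by the ADC and DAC, the processor works on the other pair, and the pairs swap every block.

```python
def simulate_double_buffering(samples, n, process):
    """Return the DAC output stream for an input stream split into blocks of n.

    len(samples) is assumed to be a multiple of n; process maps a block
    (list of n values) to a new list of n values.
    """
    in_buf = [[0] * n, [0] * n]     # two input buffers
    out_buf = [[0] * n, [0] * n]    # two output buffers (4 buffers in total)
    io = 0                          # index of the pair owned by ADC/DAC
    dac = []
    for start in range(0, len(samples), n):
        dac.extend(out_buf[io])                          # DAC drains one output buffer
        in_buf[io][:] = samples[start:start + n]         # ADC fills one input buffer
        out_buf[1 - io][:] = process(in_buf[1 - io])     # CPU works on the other pair
        io = 1 - io                                      # swap buffer ownership
    return dac


if __name__ == "__main__":
    stream = list(range(1, 13))
    print(simulate_double_buffering(stream, 4, lambda block: list(block)))
```

With a pass-through `process`, the first input block only reaches the output after two full block periods, which illustrates the 2NTs delay stated above.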

What can we do with a DSP?
Almost any linear and nonlinear system (PID controller).
Digital filters (FIR-IIR).
Adaptive systems (LMS algorithm).
Modulators and demodulators.
Any mathematically intensive algorithm (FFT-DCT-WT).

Linear systems implementation
x(n) and h(n) are arrays of numbers. If we want to compute y(n), we have to multiply and sum the last M samples, M being the length of h(n).
This is repeated for every new sample received from the ADC.
As you can see, any linear system uses multiplications, accumulations (sums), and loops intensively.
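The multiply-and-sum over the last M samples can be sketched as a sample-by-sample FIR filter. The sketch assumes the usual convolution form y(n) = sum over k of h(k) x(n-k), with samples before the start of the stream taken as zero; the function names are illustrative.

```python
def fir_output(h, history):
    """One output sample: multiply and accumulate over the last M inputs.

    history[0] is the newest input sample, history[k] is x(n - k).
    """
    acc = 0.0
    for k in range(len(h)):          # the loop a DSP runs for every new sample
        acc += h[k] * history[k]     # one multiply-accumulate (MAC) per tap
    return acc


def fir_filter(h, x):
    """Filter the whole input stream x, one sample at a time."""
    history = [0.0] * len(h)
    y = []
    for sample in x:                            # one new sample from the ADC
        history = [sample] + history[:-1]       # shift the delay line
        y.append(fir_output(h, history))
    return y


if __name__ == "__main__":
    print(fir_filter([0.5, 0.5], [1.0, 1.0, 1.0]))
```

Each output costs M multiplications and M additions, so a filter with M taps running at a sampling rate fs needs about M times fs MAC operations per second.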

Fast Fourier Transform FFT
[Figure: Fast Fourier Transform (FFT)]

Summary of desirable features of a DSP
Fast in mathematical operations and combinations of them (multiply and sum especially).
Flexible addressing modes (bit reversal, circular buffers, zero-overhead loops).
DSP-specific instruction set (arithmetic shifting, saturating arithmetic, rounding, normalization).
Minimum-overhead peripherals (communication devices especially).
DSP instructions for specific applications (Video, Control, Audio).
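As an illustration of one item in the list above, saturating arithmetic, the sketch below compares a 16-bit signed addition that wraps around with one that saturates. It assumes two's-complement 16-bit words; the helper names are hypothetical and do not correspond to any particular DSP instruction.

```python
INT16_MIN = -32768
INT16_MAX = 32767


def wrap_add16(a, b):
    """16-bit two's-complement addition with wraparound on overflow."""
    return (a + b + 32768) % 65536 - 32768


def sat_add16(a, b):
    """16-bit addition that clips to the representable range instead of wrapping."""
    return max(INT16_MIN, min(INT16_MAX, a + b))


if __name__ == "__main__":
    print(wrap_add16(30000, 10000))   # overflow flips the sign
    print(sat_add16(30000, 10000))    # overflow clips at full scale
```

On a signal path, clipping at full scale is usually a far smaller error than a sign flip, which is why saturation is commonly supported in hardware on DSPs.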

So, those are DSP math features
Multiply and Accumulate (MAC) units.
ALUs (fixed and floating point).
Barrel shifters.
Depending on the DSP application, more than one unit is present in modern DSPs, allowing parallelism.
Harvard (modified) architecture provides multiple operations per cycle.

Architectural Features for Efficient Programming
Specialized addressing modes
Hardware Loop Constructs
Cacheable memories
Multiple operations per cycle
Interlocked pipeline
Other important features

Specialized Addressing Modes

Circular buffering
B0 = 0x00; L0 = 44;    // Base and length
I0 = 0x00; M0 = 16;    // Index and increment
R0 = [I0++M0];         // R0=1 & I0=0x10
R0 = [I0++M0];         // R0=5 & I0=0x20
R0 = [I0++M0];         // R0=9 & I0=0x04
R0 = [I0++M0];         // R0=2 & I0=0x14
R0 = [I0++M0];         // R0=6 & I0=0x24

Bit-Reversal
B0 = 0x00; L0 = 0;     // Base and length
I0 = 0; M0 = 1;        // Index and increment
I2 = 256; P0 = 8;
LOOP(start, end) LC0 = P0;
start: // I0 automatically incremented in B-R progression
R0 = [I0] || I0 += M0 (BREV);
end: // I2 points to bit-reversed buffer
[I2++] = R0;
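The address sequence in the circular-buffering listing can be checked with a small model. It assumes 32-bit (4-byte) words and a buffer holding the values 1 to 11, which is inferred from the register comments (R0 = 1, 5, 9, 2, 6) rather than stated in the slides. A second helper lists indices in bit-reversed order, the access pattern an FFT needs. All names are illustrative.

```python
def circular_reads(memory, base, length, index, step, count):
    """Model post-modify reads with wraparound; returns (value, next_index) pairs.

    base, length, index and step are byte quantities; memory holds 4-byte words.
    """
    trace = []
    for _ in range(count):
        value = memory[index // 4]       # load the word at the current index
        index += step                    # post-modify by the increment
        if index >= base + length:       # past the end: wrap around
            index -= length
        trace.append((value, index))
    return trace


def bit_reversed_order(n_bits):
    """Indices 0 .. 2**n_bits - 1 with their address bits reversed."""
    order = []
    for i in range(1 << n_bits):
        r = 0
        for b in range(n_bits):
            if i & (1 << b):
                r |= 1 << (n_bits - 1 - b)
        order.append(r)
    return order


if __name__ == "__main__":
    words = list(range(1, 12))           # assumed buffer contents: 1 .. 11
    for value, nxt in circular_reads(words, 0x00, 44, 0x00, 16, 5):
        print(value, hex(nxt))
    print(bit_reversed_order(3))
```

With the assumed buffer contents, the model yields the same value and index pairs as the comments in the listing, including the wrap from 0x20 to 0x04.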


Hardware Loop Constructs
Looping is a critical feature in communications processing algorithms.
There are two key looping-related features that can improve performance on a wide variety of algorithms:
zero-overhead hardware loop
hardware loop buffers

Cacheable memories
Without caches, today's high-speed processors would effectively run at much slower speeds because larger applications would only fit in slower external memory.
Programmers would be forced to manually move key code in and out of internal SRAM.
By adding data and instruction caches into the architecture, external memory becomes much more manageable.


Multiple operations per cycle
In addition to performing multiple ALU/MAC operations each core processor cycle, additional data loads and stores can also be completed in the same cycle.
The memory is typically partitioned into sub-banks that can be dual-accessed by the core and optionally by a DMA controller.
There are two multi-issue architectures: VLIW and superscalar.
Interlocked pipeline
In order to increase throughput, DSPs are designed to be pipelined.
When assembly programming is required, the pipeline can make programming more challenging.
The processor automatically handles stalls and bubbles.

Other important features
RISC-like registers and instruction set.
Multiple data/program buses.
DMA controller for handling peripherals.
In traditional fixed-point DSPs, word sizes are usually fixed. However, there is an advantage to having data registers that can be treated as:
One 64-bit word
Two 32-bit words
Four 16-bit words
Eight 8-bit words
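The idea of viewing one register at several word sizes can be illustrated with plain integer masking. This sketch assumes unsigned lanes ordered from the least significant bits upward; the name `split_lanes` is hypothetical and does not model any specific processor.

```python
def split_lanes(word64, lane_bits):
    """Split a 64-bit value into 64 // lane_bits unsigned lanes, lowest lane first."""
    mask = (1 << lane_bits) - 1
    return [(word64 >> (i * lane_bits)) & mask for i in range(64 // lane_bits)]


if __name__ == "__main__":
    reg = 0x0123456789ABCDEF
    for bits in (64, 32, 16, 8):
        print(bits, [hex(v) for v in split_lanes(reg, bits)])
```

The same 64 bits hold one, two, four or eight values, so a single instruction operating on all lanes can process several narrow samples (for example 8-bit pixels) at once.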

DSP classification
Fixed or floating point arithmetic.
Millions of multiply-accumulate operations per second, MMACs.
Millions of floating-point operations per second, MFLOPS.
Application-specific features (video, audio, control, communications).
Memory.

Why DSP hardware?
Special-purpose (custom) chips such as application-specific integrated circuits (ASIC).
Field-programmable gate arrays (FPGA).
General-purpose microprocessors or microcontrollers (µP/µC).
General-purpose digital signal processors (DSP processors).
DSP processors with application-specific hardware (HW) accelerators.

TI Processors
C6000 High Performance DSPs
Ideal for imaging, broadband infrastructure and performance audio applications.
C6000 Performance Value DSPs
Ideal for broadband infrastructure and performance audio applications. Lower cost.
C6000 Floating-point DSPs
Ideal for professional audio products, biometrics, medical, industrial, digital imaging, speech recognition, conference phones and voice-over-packet.
C5000 Power-Efficient DSPs
Optimized for power- and cost-efficient embedded signal processing solutions.
C2000 32-bit Real-time MCUs
Optimized core can run multiple complex control algorithms at speeds necessary for demanding control applications.

C5000 DSP Platform Roadmap
[Figure: C5000 DSP platform roadmap]

C6000 DSP Platform Roadmap
[Figure: C6000 DSP platform roadmap]

TI's ARM Processor-Based Products
Sitara ARM Microprocessors
Cortex-A8 and ARM9-based embedded microprocessors
Clock Speed: 300 MHz to 1.5 GHz
3D Graphics Accelerator and Power technology (OMAP)
Stellaris MCU
ARM Cortex-M3
Clock Speed: Up to 100 MHz
Up to 125 MIPS (at 100 MHz)
Advanced integration: Serial interfaces, motion control, system, analog
OMAP Applications Processors
ARM9-based devices. LP, general-purpose, multimedia and graphics
processing
ARM Cortex-A8 core
C64x+ DSP and Video Accelerators
3-D Graphics Acceleration
DaVinci Digital Media Processors
Optimized for digital video systems
ARM9 only, ARM9 + DSP and DSP only.

TI's ARM Processor-Based Products
[Figure: TI's ARM processor-based products]

ADI Processors
TigerSHARC Processors
32-bit fixed-point as well as floating-point
Clock Speed: 250MHz to 600MHz
4.8 GMACs of 16-bit performance / 3.6 GFLOPs
24 Mbits of on-chip memory
5 Gbytes of I/O bandwidth
SHARC Processors
32-Bit floating-point
Clock Speed: 150MHz to 400MHz / 2.4 GFLOPs.
Accelerator Architecture: FIR, IIR, FFT.
Blackfin Processors
16/32-bit fixed point
Clock Speed: 200MHz to 756MHz / 1.5 GMACs
Very low power consumption: 0.23 mW/MHz
RTOS supported. Multicore 600MHz / 2.4 GMACs.
ADSP-21xx Processors
16/32-bit fixed point
Clock Speed: 75MHz to 160MHz
Analog Devices brought its first programmable processor to market in 1986.
ADSP-21xx Processors
[Figure: ADSP-2191 block diagram]

Blackfin Processors
[Figure: ADSP-BF536/ADSP-BF537 block diagram]


SHARC Processors
[Figure: ADSP-2146x block diagram]

TigerSHARC Processor
[Figure: ADSP-TS201S block diagram]

Markets and Applications
[Figure: markets and applications]

Recommended bibliography
R. G. Lyons, Understanding Digital Signal Processing, 2nd ed. Prentice Hall, 2004.
Ch2: Periodic Sampling
S. W. Smith, The Scientist and Engineer's Guide to DSP. California Tech. Pub., 1997.
Ch1: The Breadth and Depth of DSP
Ch3: ADC and DAC
S. M. Kuo, B. H. Lee, Real-Time Digital Signal Processing, 2nd ed. John Wiley and Sons, 2006.
Ch1: Introduction to Real-Time Digital Signal Processing

NOTE: Many images used in this presentation were extracted from the
recommended bibliography.


Questions?

Thank you!
Eng. Julian S. Bruno FI-UBA 2010