Vlsidsp Chap1 PDF

VLSI Digital Signal Processing Systems
Introduction to Digital Signal Processing Systems
Lan-Da Van, Ph.D.
Department of Computer Science
National Chiao Tung University
Taiwan, R.O.C.
Fall, 2010
ldvan@cs.nctu.edu.tw
http://www.cs.nctu.edu.tw/~ldvan/
Outlines
Introduction
DSP Algorithms
DSP Applications and CMOS ICs
Representations of DSP Algorithms
Conclusion
References
Lan-Da Van
VLSI-DSP-1-2
Why Use Digital Signal Processing?

Robust to temperature and process variations
Accuracy can be controlled more precisely
Tolerant of noise and interference
Mathematical representation
Programmability
Common System Configuration
[Figure: common system configuration with the blocks Multimedia-Communication Applications, VLSI Signal Processing Library, Processor, and Software.]
VLSI Signal Processing System Design Spectrum (1/2)
Computer arithmetic: adder, multiplier, inverse square root, division
Digital filter: multidimensional filter, symmetry filter
Adaptive digital filter: LMS/DLMS (delayed LMS) based, RLS based
Transform: multiplier-accumulator based, recursive-filter based, ROM-based (DA, CORDIC), butterfly based
Processor: general-purpose processor, DSP processor, reconfigurable computing processor
3D graphics: geometry transformation, rasterization/rendering, Z-buffer compression, texture compression
Ear-aid system: adaptive algorithm, filter bank
System security

Design Spectrum (2/2)
MIMO detection: grouped detection, VBLAST, K-Best
Biomedical computation: ICA, PCA, HRV
ADC: SAR ADC, pipeline ADC, sigma-delta
PLL
Image processing: pattern recognition, median filter, image reconstruction, image projection
Video processing: compression, block matching, deblocking filter
Non-numerical operation
Error control coding: Viterbi decoder, turbo code
Polynomial computation
Dynamic programmable

Publication Areas (not exhaustive)
IEEE Trans. on Biomedical Engineering

IEEE Trans. on Circuits and Systems I: Regular Papers
IEEE Trans. on Circuits and Systems II: Express Briefs
IEEE Trans. on Circuits and Systems for Video Technology
IEEE Trans. on Communications
IEEE Trans. on Computer-Aided Design of Integrated Circuits
IEEE Trans. on Computers
IEEE Trans. on Image Processing
IEEE Trans. on Information Theory
IEEE Trans. on Multimedia
IEEE Trans. on Neural Networks
IEEE Journal on Selected Areas in Communications
IEEE Trans. on Signal Processing
IEEE Journal of Solid-State Circuits
IEEE Trans. on VLSI Systems
IEEE Trans. on Visualization and Computer Graphics
Proceedings of the IEEE
ACM Trans. on Graphics
Journal of Signal Processing Systems
IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences
Elsevier Integration - The VLSI Journal

Design Space
[Figure: design space relating the abstraction levels (system, algorithm, architecture, logic, circuit, process) to the design metrics cost, performance, area, power, and test.]
Outlines
Features:
DSP Algorithms
DSP Algorithms
Algorithm: A set of rules for solving a problem in a finite number of steps.
Convolution
Correlation
Digital filters
Adaptive filters
Discrete Fourier transform
Source coding algorithms
Discrete cosine transform
Motion estimation
Huffman coding
Vector quantization
Decimator and expander
Wavelet and filter banks
Viterbi algorithm and dynamic programming
Signals
Analog signal
t->y: y=f(t), y:C, t:C
Discrete-time signal
n->y: y=f(nT), y:C, n:Z
Digital signal
n->y: y=D{f(nT)}, y:Z,n:Z
[Figure: the same waveform shown as an analog signal, as a discrete-time signal sampled at integer n, and as a digital signal whose samples are binary words such as (1110)_2, (1011)_2, and (1000)_2.]
LTI Systems
Linear systems
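Assume x1(n)->y1(n) and x2(n)->y2(n), where -> denotes "leads to". If ax1(n)+bx2(n)->ay1(n)+by2(n) for any constants a and b, then the system is referred to as a linear system.
Homogeneity and additivity properties
Time-invariant (TI) systems
If x(n)->y(n), then x(n-n0)->y(n-n0) for any shift n0
LTI systems
y(n)=h(n)*x(n), where * denotes convolution and h(n) is the impulse response
Causal systems
y(n0) depends only on x(n), where n<=n0
Stable systems
BIBO (bounded-input, bounded-output) stability: every bounded input produces a bounded output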
Sampling of Analog Signals

Nyquist sampling theorem
The analog signal must be band-limited

Sample rate must be larger than twice the bandwidth
System-Equation Representation
Impulse/unit sample response
$h(n) = b_0 a_1^{n} u[n]$
Transfer function / frequency response
$H(z) = \frac{Y(z)}{X(z)} = \frac{b_0}{1 - a_1 z^{-1}}$
Difference equations
$y(n) = a_1 y(n-1) + b_0 x(n)$
State equations
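The three descriptions of the first-order system can be cross-checked numerically. The sketch below is illustrative only: the function name `first_order_iir`, the zero initial state, and the coefficient values are assumptions, not part of the slides. It runs the difference equation on a unit impulse and compares the output with the closed-form impulse response.

```python
def first_order_iir(x, a1, b0):
    """Evaluate y(n) = a1*y(n-1) + b0*x(n), assuming y(-1) = 0."""
    y = []
    prev = 0.0  # assumed initial rest condition
    for sample in x:
        prev = a1 * prev + b0 * sample
        y.append(prev)
    return y


# Unit impulse in, impulse response out (example coefficients are arbitrary).
impulse = [1.0, 0.0, 0.0, 0.0, 0.0]
h = first_order_iir(impulse, a1=0.5, b0=2.0)

# Closed form from the slide: h(n) = b0 * a1**n for n >= 0.
closed_form = [2.0 * 0.5 ** n for n in range(5)]
```

With these values the simulated response and the closed form agree sample by sample.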
Convolution & Correlation

Convolution
$y(n) = x(n) * h(n) = \sum_{k=-\infty}^{\infty} x(k) h(n-k) = \sum_{k=-\infty}^{\infty} h(k) x(n-k)$
Correlation
$y(n) = \sum_{k=-\infty}^{\infty} a(k) x(n+k) = \sum_{k=-\infty}^{\infty} a(-k) x(n-k) = a(-n) * x(n)$
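A minimal sketch of both operations for finite-length sequences, assuming samples outside the given lists are zero. The helper names `convolve` and `correlate` are hypothetical. Correlation is computed as convolution with the time-reversed sequence, which mirrors the identity y(n) = a(-n) * x(n).

```python
def convolve(x, h):
    """Linear convolution y(n) = sum_k x(k) h(n-k) of two finite sequences."""
    y = [0.0] * (len(x) + len(h) - 1)
    for k, xk in enumerate(x):
        for m, hm in enumerate(h):
            y[k + m] += xk * hm
    return y


def correlate(a, x):
    """Correlation y(n) = sum_k a(k) x(n+k), via convolution with reversed a.

    Output index i corresponds to lag n = i - (len(a) - 1).
    """
    return convolve(a[::-1], x)


y_conv = convolve([1, 2, 3], [1, 1])
y_corr = correlate([1, 2], [3, 4, 5])
```

The double loop costs one multiply-accumulate per pair of samples, which is the operation count that dedicated multiplier-accumulator hardware is built to handle.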
Linear Phase FIR Digital Filters

Digital filters are an important class of LTI systems.
Linear phase FIR filter
$h(n) = h(M-n)$
Example with $M = 6$:
$h(0) = h(6) = b_0$
$h(1) = h(5) = b_1$
$h(2) = h(4) = b_2$
$h(3) = b_3$
$y(n) = b_0 x(n) + b_1 x(n-1) + b_2 x(n-2) + b_3 x(n-3) + b_2 x(n-4) + b_1 x(n-5) + b_0 x(n-6)$
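The coefficient symmetry lets samples that share a coefficient be added before multiplying, so the 7-tap example needs 4 multiplications per output instead of 7. The sketch below illustrates this under stated assumptions: the function names are hypothetical, and samples before the start of the input are taken as zero.

```python
def fir_direct(x, h):
    """Direct-form FIR: y(n) = sum_k h(k) x(n-k), zero initial conditions."""
    return [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
            for n in range(len(x))]


def fir_linear_phase(x, b):
    """Symmetric FIR with h(n) = h(M-n), M = 2*(len(b)-1), b = [b0, ..., b_{M/2}].

    Pairs x(n-k) and x(n-(M-k)) are pre-added, so only len(b) multiplications
    are needed per output sample.
    """
    M = 2 * (len(b) - 1)

    def sample(i):
        return x[i] if 0 <= i < len(x) else 0.0

    y = []
    for n in range(len(x)):
        acc = b[-1] * sample(n - M // 2)          # centre tap b_{M/2}
        for k in range(len(b) - 1):
            acc += b[k] * (sample(n - k) + sample(n - (M - k)))
        y.append(acc)
    return y


x = [1, 2, 3, 4, 5, 6, 7, 8]
b = [1, 2, 3, 4]                       # b0..b3 (arbitrary example values)
h = [1, 2, 3, 4, 3, 2, 1]              # full symmetric impulse response
```

Both functions produce the same output for this input; the second simply reorders the arithmetic.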
IIR Filter Structures

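$H(z) = \frac{b_0 + b_1 z^{-1} + b_2 z^{-2}}{1 - a_1 z^{-1} - a_2 z^{-2}}$
[Figure: second-order IIR filter structure with input x(n), output y(n), feedforward coefficients b0, b1, b2, feedback coefficients a1, a2, and two z^{-1} delay elements.]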
Introduction to an Adaptive Algorithm
Widely used in communications, DSP, and control systems
Deterministic gradient / least-squares algorithms
Steepest descent algorithm
RLS algorithm
Stochastic gradient algorithms
LMS algorithm, DLMS algorithm
Block LMS algorithm
Gradient lattice algorithm
Adaptive Applications
Channel equalizer
System identification
Echo canceller
Noise cancellation
Predictor
Line enhancement
Beamformer
Image enhancement
Notation
1. Input signal: X(n)
2. Desired output: d(n)
3. Weight vector: W(n)
4. Adaptation factor: $\mu$
5. Error: e(n)
6. Misadjustment: $M_{adj}$
7. Tap number: N
8. Autocorrelation matrix: R
9. Eigenvalue: $\lambda$
10. Diagonal matrix: $\Lambda$
Steepest Descent Algorithm
$y(n) = W^{T}(n) X(n)$
where $X(n) = [x(n) \; x(n-1) \; \ldots \; x(n-N+1)]^{T}$
and $W(n) = [w_0(n) \; w_1(n) \; \ldots \; w_{N-1}(n)]^{T}$
The error at the n-th time is
$e(n) = d(n) - y(n) = d(n) - W^{T}(n) X(n) = d(n) - X^{T}(n) W(n)$
LMS Algorithm
An efficient software implementation of steepest descent using measured or estimated gradients
The gradient is estimated from the square of a single error sample, $J(n) = e^{2}(n)$:
$W(n+1) = W(n) + \frac{1}{2} \mu \left( -\nabla J(n) \right)$
$\nabla J(n) = -2 e(n) X(n)$
$W(n+1) = W(n) + \mu e(n) X(n)$
[Figure: cost surface plotted over the weights w0 and w1.]
Summary of LMS Adaptive Algorithm (1960)
$y(n) = w^{T}(n) x(n)$
$e(n) = d(n) - y(n)$
$w(n+1) = w(n) + \mu e(n) x(n)$
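The three LMS equations translate almost line by line into code. The sketch below is an illustration rather than a reference implementation: the function name `lms`, the step size `mu` (written here for the adaptation factor, whose symbol is assumed), the 2-tap "unknown" system, and the seeded random input are all assumptions chosen for a small system-identification demo.

```python
import random


def lms(x, d, num_taps, mu):
    """LMS adaptive FIR filter.

    Per sample: y(n) = w^T x(n), e(n) = d(n) - y(n), w(n+1) = w(n) + mu*e(n)*x(n).
    Returns the final weights and the error sequence.
    """
    w = [0.0] * num_taps
    taps = [0.0] * num_taps          # X(n) = [x(n), x(n-1), ..., x(n-N+1)]
    errors = []
    for xn, dn in zip(x, d):
        taps = [xn] + taps[:-1]      # shift the delay line
        y = sum(wi * xi for wi, xi in zip(w, taps))
        e = dn - y
        w = [wi + mu * e * xi for wi, xi in zip(w, taps)]
        errors.append(e)
    return w, errors


# System identification: the desired signal is the output of a hypothetical
# 2-tap FIR system [0.5, -0.3] driven by the same input (noise-free).
rng = random.Random(0)
x = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
d = [0.5 * x[n] - 0.3 * (x[n - 1] if n > 0 else 0.0) for n in range(len(x))]
w, errors = lms(x, d, num_taps=2, mu=0.1)
```

In this noise-free setting the weights approach the coefficients of the unknown system; with measurement noise they would fluctuate around them, which is what the misadjustment term in the notation list quantifies.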
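Block Diagram of an Adaptive FIR Filter Driven by the LMS Algorithm
[Figure: tapped delay line of z^{-1} elements producing x(n), x(n-1), x(n-2), ..., x(n-N+1); the taps are weighted by w0(n), w1(n), w2(n), ..., wN-1(n) and summed to give y(n), which is compared with d(n) to form the error e(n).]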
Unitary/Orthogonal Transform (1/4)

Definition (from linear algebra):
Let A be an n x n matrix that satisfies
$A A^{H} = A^{H} A = I$, where $A^{H}$ is the conjugate transpose of A (equal to $A^{T}$ when A is real).
A is called a unitary matrix if it has complex entries and an orthogonal matrix if it has real entries.
Why Orthogonal Transformation? (2/4)
Energy conservation
Energy compaction
Most unitary transforms tend to pack a large fraction of the average signal energy into relatively few transform coefficients.
Decorrelation
When signals are highly correlated, the transform coefficients tend to be uncorrelated (or less correlated).
Information preservation
The information carried by signals is preserved under a unitary transform.

(3/4)
[Figure: two plots, "The original signal" and "The DCT coefficients".]
Source: Lecture of Prof. Dennis Deng

(4/4)
[Figure: two plots, "The autocorrelation of original signal" and "The autocorrelation of DCT coefficients".]
Source: Lecture of Prof. Dennis Deng
Discrete Fourier Transform (1/9)

DFT
$X(k) = \sum_{n=0}^{N-1} x(n) W_N^{nk}, \quad k = 0, 1, \ldots, N-1$
$W_N = e^{-j \frac{2\pi}{N}}, \quad W_N^{nk} = e^{-j \frac{2\pi}{N} nk}$
IDFT
$x(n) = \frac{1}{N} \sum_{k=0}^{N-1} X(k) W_N^{-nk}, \quad n = 0, 1, \ldots, N-1$
Fast Fourier Transform (2/9)

The radix-2 algorithm is the most widely used fast algorithm to compute the DFT.
Without loss of generality, we use an 8-point DFT (N=8) to illustrate the development of the fast algorithm.
$X(k) = \sum_{n=0}^{7} x(n) W_N^{kn}$
$= x(0) + x(2) W_N^{2k} + x(4) W_N^{4k} + x(6) W_N^{6k} + x(1) W_N^{k} + x(3) W_N^{3k} + x(5) W_N^{5k} + x(7) W_N^{7k}$
$= x(0) + x(2) W_N^{2k} + x(4) W_N^{4k} + x(6) W_N^{6k} + W_N^{k} \left( x(1) + x(3) W_N^{2k} + x(5) W_N^{4k} + x(7) W_N^{6k} \right)$

Since
$W_N^{2k} = e^{-j \frac{2\pi}{N} 2k} = e^{-j \frac{2\pi}{N/2} k} = W_{N/2}^{k}$
8-point DFT => nearly two 4-point DFTs
$X(k) = x(0) + x(2) W_{N/2}^{k} + x(4) W_{N/2}^{2k} + x(6) W_{N/2}^{3k} + W_N^{k} \left( x(1) + x(3) W_{N/2}^{k} + x(5) W_{N/2}^{2k} + x(7) W_{N/2}^{3k} \right)$
$= F_1(k) + W_N^{k} F_2(k), \quad k = 0, 1, 2, \ldots, N-1$
where $F_1(k)$ and $F_2(k)$ represent the DFTs of the two sequences $f_1(n) = x(2n)$ and $f_2(n) = x(2n+1)$.

One step further: (=> two 4-point DFTs)
$W_N^{k+N/2} = e^{-j \frac{2\pi}{N} (k + N/2)} = e^{-j \frac{2\pi}{N} k} e^{-j\pi} = -W_N^{k}$ and $W_{N/2}^{k+N/2} = W_{N/2}^{k}$
$X(k) = F_1(k) + W_N^{k} F_2(k), \quad k = 0, 1, \ldots, N/2-1$
$X(k + N/2) = F_1(k) - W_N^{k} F_2(k), \quad k = 0, 1, \ldots, N/2-1$
An N-point DFT requires $N^{2}$ complex multiplications. The number of complex multiplications required by the above algorithm is
$2 (N/2)^{2} + N = 40 \quad (N = 8)$
whereas a direct 8-point DFT requires 64 complex multiplications.
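Applying the split X(k) = F1(k) + W_N^k F2(k), X(k+N/2) = F1(k) - W_N^k F2(k) recursively gives the radix-2 decimation-in-time FFT. The following is a minimal sketch assuming the input length is a power of two; the function names `dft` and `fft` are illustrative, and the direct DFT is included only as a cross-check.

```python
import cmath


def dft(x):
    """Direct DFT: X(k) = sum_n x(n) W_N^{nk}, with W_N = exp(-j*2*pi/N)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * n * k / N) for n in range(N))
            for k in range(N)]


def fft(x):
    """Recursive radix-2 decimation-in-time FFT (len(x) must be a power of 2)."""
    N = len(x)
    if N == 1:
        return list(x)
    F1 = fft(x[0::2])                 # DFT of f1(n) = x(2n)
    F2 = fft(x[1::2])                 # DFT of f2(n) = x(2n+1)
    X = [0j] * N
    for k in range(N // 2):
        t = cmath.exp(-2j * cmath.pi * k / N) * F2[k]   # W_N^k F2(k)
        X[k] = F1[k] + t
        X[k + N // 2] = F1[k] - t
    return X


x = [1, 2, 3, 4, 0, -1, -2, -3]
X_fast = fft(x)
X_ref = dft(x)
```

Each level of the recursion performs N/2 twiddle multiplications, which is where the reduced multiplication count comes from.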

The 4-point DFT can be decomposed into two 2-point DFTs in a similar way.
$F_1(k) = x(0) + x(4) W_{N/2}^{2k} + x(2) W_{N/2}^{k} + x(6) W_{N/2}^{3k}$
$= x(0) + x(4) W_{N/4}^{k} + W_{N/2}^{k} \left( x(2) + x(6) W_{N/4}^{k} \right)$
$= V_{11}(k) + W_{N/2}^{k} V_{12}(k)$
where $V_{11}(k)$ and $V_{12}(k)$ represent the DFTs of the two sequences $v_{11}(n) = f_1(2n)$ and $v_{12}(n) = f_1(2n+1)$.
As before,
$F_1(k) = V_{11}(k) + W_{N/2}^{k} V_{12}(k), \quad k = 0, 1, \ldots, N/4-1$
$F_1(k + N/4) = V_{11}(k) - W_{N/2}^{k} V_{12}(k), \quad k = 0, 1, \ldots, N/4-1$

A 2-point FFT, such as $V_{11}(k)$ and $V_{12}(k)$, involves only additions and subtractions:
$V_{11}(k) = x(0) + W_2^{k} x(4), \quad W_2^{0} = 1, \; W_2^{1} = -1$
$V_{11}(0) = x(0) + W_N^{0} x(4)$
$V_{11}(1) = x(0) - W_N^{0} x(4)$
[Figure: butterfly with inputs a and B, twiddle factor W_N, and a -1 branch.]
Each butterfly requires one complex multiplication and two complex additions.

After decimation, the sequence is in bit-reversed order.

original order  n2n1n0 | decimation 1  n0n2n1 | decimation 2  n0n1n2
0               000    | 0             000    | 0             000
1               001    | 2             010    | 4             100
2               010    | 4             100    | 2             010
3               011    | 6             110    | 6             110
4               100    | 1             001    | 1             001
5               101    | 3             011    | 5             101
6               110    | 5             101    | 3             011
7               111    | 7             111    | 7             111
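The final ordering after both decimations can be generated by reversing the bits of each index. A small sketch, with the helper name `bit_reverse` chosen for illustration:

```python
def bit_reverse(index, num_bits):
    """Return index with its num_bits-bit binary representation reversed."""
    result = 0
    for _ in range(num_bits):
        result = (result << 1) | (index & 1)   # move the lowest bit of index in
        index >>= 1
    return result


# Input ordering for an 8-point (3-bit) decimation-in-time FFT.
order = [bit_reverse(n, 3) for n in range(8)]
```

For N = 8 this reproduces the "decimation 2" ordering 0, 4, 2, 6, 1, 5, 3, 7.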

This FFT algorithm holds for any data sequence of length $N = 2^{v}$.
There are N/2 butterflies per stage and $\log_2 N$ stages.
The number of operations required for an FFT (before simplifying):
Complex multiplications: $\frac{N}{2} \log_2 N$ (one per butterfly)
Complex additions: $N \log_2 N$ (two per butterfly)
Image/Video Compression
Where is coding applied?
Source coding
Channel coding
Source coding benefits
Lower bit rate
Less transmission time
Less storage
What kind of loss?
Lossless data compression
Lossy data compression
Why can we do compression?
Coding redundancy
Inter-sample redundancy (spatial redundancy)
Inter-frame redundancy (temporal redundancy)
Source Coding Spectrum

Image Compression
Lossless: Huffman coding, Shannon coding, arithmetic coding
Lossy: predictive coding, transform coding, VQ coding, subband coding
Image Measurement and Evaluation
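$\mathrm{SNR(dB)} = 10 \log_{10} \left( \sigma_x^{2} / \sigma_n^{2} \right)$
$\mathrm{PSNR(dB)} = 10 \log_{10} \left( 255^{2} / \sigma_n^{2} \right)$
$\sigma_n^{2} = \frac{1}{N^{2}} \sum_{i=1}^{N} \sum_{j=1}^{N} \left[ x(i,j) - \hat{x}(i,j) \right]^{2}$
where $x(i,j)$ and $\hat{x}(i,j)$ denote the original and reconstructed image values, respectively.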
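As a rough illustration of the PSNR definition for 8-bit images, the sketch below computes the mean squared error between two images stored as nested lists and converts it to decibels. The function name `psnr`, the list-of-rows image layout, and the choice to return infinity for identical images are assumptions made for this example.

```python
import math


def psnr(original, reconstructed, peak=255.0):
    """PSNR in dB between two equally sized images given as lists of rows."""
    n_pixels = sum(len(row) for row in original)
    mse = sum((a - b) ** 2
              for row_o, row_r in zip(original, reconstructed)
              for a, b in zip(row_o, row_r)) / n_pixels
    if mse == 0:
        return math.inf            # identical images: no noise power
    return 10.0 * math.log10(peak ** 2 / mse)


original = [[100, 100], [100, 100]]
reconstructed = [[101, 99], [100, 100]]   # two pixels off by one
value_db = psnr(original, reconstructed)
```

Here the noise power is 2/4 = 0.5, so the result equals 10 log10(255^2 / 0.5).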
Discrete Cosine Transform (DCTII)
$X(k) = \alpha(k) \sum_{n=0}^{N-1} x(n) \cos\left[ \frac{(2n+1) k \pi}{2N} \right], \quad 0 \le k \le N-1$
$\alpha(0) = \sqrt{\frac{1}{N}}, \quad \alpha(k) = \sqrt{\frac{2}{N}}$ for $1 \le k \le N-1$
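A direct evaluation of the DCT-II definition, written as a sketch for checking small cases rather than as a fast algorithm. The function name `dct2` is an assumption, and the scale factors are taken to be sqrt(1/N) for k = 0 and sqrt(2/N) otherwise, which makes the transform orthonormal.

```python
import math


def dct2(x):
    """Orthonormal DCT-II computed directly from the definition, O(N^2)."""
    N = len(x)
    out = []
    for k in range(N):
        alpha = math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
        acc = sum(x[n] * math.cos((2 * n + 1) * k * math.pi / (2 * N))
                  for n in range(N))
        out.append(alpha * acc)
    return out


ramp = [1, 2, 3, 4, 5, 6, 7, 8]
coeffs = dct2(ramp)
energy_in = sum(v * v for v in ramp)
energy_out = sum(c * c for c in coeffs)
```

With this scaling the energy of the coefficients matches the energy of the input, and a constant input maps to a single nonzero coefficient at k = 0.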
2-D DCT-II and IDCT-II

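DCT-II
$Z(k,l) = \frac{2}{N} \alpha(k) \alpha(l) \sum_{m=0}^{N-1} \sum_{n=0}^{N-1} x(m,n) \cos\left( \frac{(2m+1) k \pi}{2N} \right) \cos\left( \frac{(2n+1) l \pi}{2N} \right)$
$Z = A X A^{T}$
IDCT-II
$x(m,n) = \frac{2}{N} \sum_{k=0}^{N-1} \sum_{l=0}^{N-1} \alpha(k) \alpha(l) Z(k,l) \cos\left( \frac{(2m+1) k \pi}{2N} \right) \cos\left( \frac{(2n+1) l \pi}{2N} \right)$
$X = A^{T} Z A$
where k, l, m, and n range from 0 to N-1, and $\alpha(0) = 1/\sqrt{2}$ and $\alpha(j) = 1$ for $j \ne 0$.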
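The matrix forms Z = A X A^T and X = A^T Z A can be tried out on a small block. In the sketch below the helper names (`dct_matrix`, `matmul`, `transpose`, `dct2d`, `idct2d`) are hypothetical, and the entries of A are assumed to be sqrt(2/N) α(k) cos((2n+1)kπ/(2N)), which reproduces the 2/N factor of the 2-D formula.

```python
import math


def dct_matrix(N):
    """N x N DCT-II matrix A with A[k][n] = c(k) * cos((2n+1)k*pi/(2N))."""
    A = []
    for k in range(N):
        c = math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
        A.append([c * math.cos((2 * n + 1) * k * math.pi / (2 * N))
                  for n in range(N)])
    return A


def matmul(P, Q):
    return [[sum(P[i][t] * Q[t][j] for t in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]


def transpose(P):
    return [list(row) for row in zip(*P)]


def dct2d(X):
    A = dct_matrix(len(X))
    return matmul(matmul(A, X), transpose(A))      # Z = A X A^T


def idct2d(Z):
    A = dct_matrix(len(Z))
    return matmul(matmul(transpose(A), Z), A)      # X = A^T Z A


block = [[52, 55, 61, 66], [70, 61, 64, 73], [63, 59, 55, 90], [67, 61, 68, 104]]
restored = idct2d(dct2d(block))
```

Computing the 2-D transform as two passes of 1-D transforms with a transpose in between is the same row-column idea that a 1-D DCT unit plus transpose memory implements in hardware.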
How to Decide the Coefficients?

Orthogonal property
$A A^{T} = A^{T} A = I$
Parseval's theorem: energy conservation
$\sum_{n=0}^{N-1} |x(n)|^{2} = \frac{1}{N} \sum_{k=0}^{N-1} |X(k)|^{2}$
2-D DCT/IDCT Processor
[Figure: (a) a 1-D DCT/IDCT unit followed by a transpose memory and a second 1-D DCT/IDCT unit; (b) a single 1-D DCT/IDCT unit with a 2:1 MUX, a 1:2 DMUX, and a transpose memory.]
Block-Matching Algorithm
Rule:
$s(m,n) = \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} | x(i,j) - y(i+m, j+n) |$, for $-p \le m, n \le p$
$u = \min_{(m,n)} \{ s(m,n) \}$
$v = (m,n) |_{u}$, for $-p \le m, n \le p$
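A full-search version of the block-matching rule can be sketched as follows. The function name `block_match`, the frame layout as lists of rows, and the decision to skip candidate positions that fall outside the reference frame are assumptions made for this example.

```python
def block_match(block, ref, top, left, p):
    """Full-search block matching with the sum of absolute differences (SAD).

    block: N x N current block whose top-left corner sits at (top, left).
    ref:   reference frame (list of rows).
    p:     search range; displacements (m, n) satisfy -p <= m, n <= p.
    Returns (u, v): the minimum SAD and the displacement that achieves it.
    """
    N = len(block)
    best = None
    for m in range(-p, p + 1):
        for n in range(-p, p + 1):
            r0, c0 = top + m, left + n
            if r0 < 0 or c0 < 0 or r0 + N > len(ref) or c0 + N > len(ref[0]):
                continue  # assumed policy: ignore candidates outside the frame
            s = sum(abs(block[i][j] - ref[r0 + i][c0 + j])
                    for i in range(N) for j in range(N))
            if best is None or s < best[0]:
                best = (s, (m, n))
    return best


# Reference frame with distinct values; the block is copied from position (2, 3).
ref = [[10 * r + c for c in range(6)] for r in range(6)]
block = [row[3:5] for row in ref[2:4]]
u, v = block_match(block, ref, top=1, left=2, p=2)
```

Searching from (1, 2) finds the exact copy one row down and one column right, so the minimum SAD is zero at displacement (1, 1). The cost grows with (2p+1)^2 N^2 absolute differences per block, which is why motion estimation dominates video encoder workloads.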
Huffman Coding (1/3)

Entropy
Information measurement
Uncertainty measurement
Surprise measurement
$H(P) = \sum_{i=1}^{n} p_i \log_2 (1 / p_i)$, where n is the number of symbols
Compression ratio
$C_r = \frac{\text{uncoded bits}}{\text{coded bits}}$

Input   Probability
x1      1/2
x2      1/4
x3      1/8
x4      1/16
x5      1/64
x6      1/64
x7      1/64
x8      1/64
[Figure: Huffman tree for these inputs with branches labeled 0 and 1.]

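Data   Huffman Code   Natural Code
x1     1              000
x2     01             001
x3     001            010
x4     0001           011
x5     000001         100
x6     000000         101
x7     000011         110
x8     000010         111

$H(x) = \text{Entropy} = \frac{1}{2} \log_2 2 + \frac{1}{4} \log_2 4 + \frac{1}{8} \log_2 8 + \frac{1}{16} \log_2 16 + \frac{1}{64} (4 \times \log_2 64) = 2$ bits
AvLen(Natural code) = 3 bits
AvLen(Huffman code) $= \frac{1}{2} \times 1 + \frac{1}{4} \times 2 + \frac{1}{8} \times 3 + \frac{1}{16} \times 4 + \frac{1}{64} (4 \times 6) = 2$ bits
$C_r = \frac{\text{uncoded bits}}{\text{coded bits}} = \frac{3}{2} = 1.5$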
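The arithmetic of this example can be reproduced in a few lines. The sketch below uses the probabilities and code words of the example; the code word "1" for x1 is assumed here, since it is the only one-bit word that keeps the code prefix-free. Variable names are illustrative.

```python
import math
from fractions import Fraction

probs = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 8), Fraction(1, 16)] \
        + [Fraction(1, 64)] * 4
huffman = ["1", "01", "001", "0001", "000001", "000000", "000011", "000010"]
natural_bits = 3                                  # fixed-length code for 8 symbols

# Entropy H = sum p_i * log2(1/p_i)
entropy = sum(float(p) * math.log2(float(1 / p)) for p in probs)

# Average Huffman code length = sum p_i * len(code_i), kept exact with Fraction
avg_len = sum(p * len(code) for p, code in zip(probs, huffman))

# Compression ratio = uncoded bits / coded bits
compression_ratio = Fraction(natural_bits) / avg_len

# Prefix-free check: no code word is a prefix of another
prefix_free = not any(a != b and b.startswith(a) for a in huffman for b in huffman)
```

Because every probability is a power of 1/2, the average Huffman length equals the entropy exactly, so no code can do better on this source.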
Vector Quantization
$d(x, y) = \| x - y \|^{2} = \sum_{i=0}^{k-1} (x_i - y_i)^{2}$
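In vector quantization the squared-distance measure is used to pick the closest codeword for each input vector. A minimal sketch, assuming a small hand-written codebook; the function names `squared_distance` and `quantize` and the codebook values are hypothetical.

```python
def squared_distance(x, y):
    """d(x, y) = sum_i (x_i - y_i)^2 for two k-dimensional vectors."""
    return sum((xi - yi) ** 2 for xi, yi in zip(x, y))


def quantize(x, codebook):
    """Return the index of the codeword closest to x in squared distance."""
    return min(range(len(codebook)),
               key=lambda i: squared_distance(x, codebook[i]))


codebook = [(0, 0), (10, 10), (0, 10)]    # illustrative 2-D codebook
index = quantize((1, 8), codebook)
```

Only the index is transmitted or stored; the decoder looks the codeword up in the same codebook.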
Outlines
Features:
DSP Algorithms
Moore's Law
[Figure: device complexity (transistors per chip, growing about 58% per year) and design productivity (growing about 21% per year) plotted from 1980 to 2010 together with gate length in microns; the gap between complexity and productivity increases. Source: Sematech]
The number of transistors per chip doubles every 18 months.
* Gordon Moore: one of the founders of Intel
Common DSP Algorithms and Their Applications
Evolution of Applications
Chronological Table of Video Coding Standards
ITU-T VCEG: H.261 (1990), H.263 (1995/96), H.263+ (1997/98), H.263++ (2000)
Joint ITU-T / ISO/IEC: MPEG-2 (H.262) (1994/95), H.264 (MPEG-4 Part 10) (2002)
ISO/IEC MPEG: MPEG-1 (1993), MPEG-4 v1 (1998/99), MPEG-4 v2 (1999/00), MPEG-4 v3 (2001)
[Timeline axis: 1990 to 2003]
Block Diagram of MPEG-2 Encoder
MPEG-2 / H.262: High Bit Rate, High Quality
MPEG-2 contains 10 parts
MPEG-2 Visual = H.262
Not especially useful below 2 Mbps (range of use normally 2-20 Mbps)
Applications: SDTV (2-5 Mbps), DVD (6-8 Mbps), HDTV (20 Mbps), VOD
Support for interlaced scan pictures
PSNR, temporal, and spatial scalability
Profile and Level
10-bit precision video sampling
Position of H.264
Block Diagram of H.264/AVC Encoder
[Figure: the input video signal is split into macroblocks of 16x16 pixels and passed through transform/scaling/quantization to entropy coding under coder control. The embedded decoder contains scaling and inverse transform, intra-frame prediction, motion compensation with intra/inter selection, and a de-blocking filter that produces the output video signal; motion estimation supplies motion data. Control data, quantized transform coefficients, and motion data go to the entropy coder.]
New Features of H.264

Multi-mode, multi-reference MC
Motion vectors can point outside the image border
1/4-, 1/8-pixel motion vector precision
B-frame prediction weighting
4x4 integer transform
Multi-mode intra-prediction
In-loop de-blocking filter
UVLC (Universal Variable Length Coding)
NAL (Network Abstraction Layer)
SP-slices
3D Graphics System
Geometry Engine
Raster Engine
Shading Algorithms (1/2)

Gouraud shading
Per-vertex lighting
Low computation
Lower shading quality
Phong shading
Per-pixel lighting
High computation
Smooth and more realistic highlights
Shading Algorithms (2/2)

Existing approximate Phong shading algorithms
Taylor expansion based approximate algorithms
Spherical interpolation based approximate algorithms
Quadratic interpolation based approximate algorithms
Mixed shading
Subdivision based approximate algorithms
[Figure: illustrations of spherical interpolation, quadratic interpolation, mixed shading, and subdivision, showing the normals NA, NB, and N(t) and pass / no-pass tests. Source: ACM/IEEE]
Four Area Networks

From small to large:
Personal area network
Local area network
Metropolitan area network
Wide area network
Each area network has a corresponding IEEE standard
WiMAX
http://tech.digitimes.com.tw/ShowNews.aspx?zCatId=134&zNotesDocId=E88A9E150386245D48256F5B00128CAC
Communication Standards Evolution (1/4)
[Figure: wireless standards placed by mobility (low to high) and data rate (0.1 Mbps to 1000 Mbps). WAN: GSM/GPRS, WCDMA, HSPA, 3GPP LTE. MAN: WiMAX 802.16d, WiMAX 802.16e, WiMAX 802.16m. LAN: 802.11b, 802.11a/g, 802.11n. PAN: RFID, ZigBee 802.15.4, Bluetooth 802.15.1, WiMedia 802.15.3a.]
Source: UMTS Forum

Comparison of LTE and WiMax
Comparison of HSDPA and WiMax

Item | HSDPA | WiMAX / 802.16e
Architecture | Circuit switched, evolved to packet on data downlink | Packet oriented
Spectrum | Licensed | Licensed/Unlicensed
Frequency bands | Below 2.7 GHz | 2-11 GHz
Channel conditions | NLOS | NLOS
Bandwidth | 5 MHz | 1.75 to 20 MHz
Symmetric/Asymmetric | Asymmetric | Symmetric
Moving speed allowed | Mobile (up to 250 km/h) | Portable (up to 100 km/h)
Multiple access | TDMA+CDMA | FDMA+TDMA
Modulation | CDMA with SF=16; QPSK, 16QAM | OFDMA with 128 to 2048 FFT; QPSK, 16QAM, 64QAM
Channel coding | Turbo code | Convolutional code
Bit rate | Up to 14.4 Mbps in 5 MHz | Up to 15 Mbps in 5 MHz
Roaming | Global | Local / regional
Source: Chunghwa Telecom Co. Ltd.
Comparisons of Various Cellular Standards
Digital Communications System

The system enables the transmitted signal to withstand the effects of various channel impairments, such as noise, interference, and fading.
[Figure: information source -> source encoder -> encrypter -> error control encoder -> modulator -> channel -> demodulator -> error control decoder -> decrypter -> source decoder -> information sink.]
Multiple Access Techniques
Source: IEEE Spectrum
OFDM System
Source: Prof. Wen, NCCU.
Outlines
Features:
DSP Algorithms
Block Diagrams
Signal-Flow Graph
Data-Flow Graph
Dependence Graph
Block Diagram of a 3-Tap FIR Filter

Def: A block diagram consists of functional blocks connected with directed edges.
SFG of a 3-Tap FIR Filter

Def: A signal flow graph (SFG) is a collection of nodes and directed edges. The nodes represent computations or tasks. In digital networks, the edges are usually restricted to constant gain multipliers or delay elements.
DFG of a 3-Tap FIR Filter

Def: A data flow graph (DFG) is a collection of nodes and directed edges. The nodes represent computations (or functions or subtasks), the directed edges represent data paths, and each edge has a nonnegative number of delays associated with it.
DG of a 3-Tap FIR Filter

Def: A dependence graph (DG) is a directed graph that shows the dependence of the computations in an algorithm. The nodes in a DG represent computations and the edges represent precedence constraints among nodes. A DG contains computations for all iterations in an algorithm and does not contain delay elements.
Conclusions
Briefly introduced the following:
DSP design issues and design views
DSP algorithms
Overview of DSP applications
Representations of DSP algorithms
