Chunjian Li
Aalborg University, Denmark
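$$y(n; m) = s(n; m) + d(n; m)$$
$$\Gamma_s(\omega) = \Gamma_y(\omega) - \Gamma_d(\omega)$$
$$|\hat{S}_s(\omega)| = \sqrt{N\,\Gamma_s(\omega)}$$
$$\hat{S}_s(\omega) = |\hat{S}_s(\omega)|\, e^{j\varphi_y(\omega)} \tag{1}$$
$$\hat{S}_s(\omega) = \big[\,|S_y(\omega)| - |S_d(\omega)|\,\big]\, e^{j\varphi_y(\omega)} \tag{2}$$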
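Magnitude subtraction with the noisy phase reused can be sketched per DFT bin as follows. This is a minimal illustration, not the lecture's implementation: the function name, the list-based interface, and the flooring of negative magnitudes at `floor` (half-wave rectification) are assumptions introduced here.

```python
import cmath


def magnitude_spectral_subtraction(noisy_spectrum, noise_magnitude, floor=0.0):
    """Subtract an estimated noise magnitude from each noisy DFT bin.

    noisy_spectrum  : list of complex DFT coefficients of the noisy frame
    noise_magnitude : list of estimated noise magnitudes, one per bin
    floor           : assumed lower bound for the resulting magnitude
    """
    enhanced = []
    for y_k, d_k in zip(noisy_spectrum, noise_magnitude):
        # Subtract magnitudes; clip so the result never goes below the floor.
        magnitude = max(abs(y_k) - d_k, floor)
        # Reuse the phase of the noisy bin for the enhanced bin.
        phase = cmath.phase(y_k)
        enhanced.append(cmath.rect(magnitude, phase))
    return enhanced
```

A bin whose noisy magnitude is below the noise estimate is set to the floor, which is one common way to avoid negative magnitudes.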
Oversuppressing ASS:
Oversuppressing PSS:
Smoothing in time:
Orthogonality principle:
$$E\Big\{\Big[s(n) - \sum_q h(q)\, y(n-q)\Big]\, y(n-k)\Big\} = 0$$
$$R_{ys}(k) = \sum_q h(q)\, R_{yy}(k-q) \qquad \text{(Wiener-Hopf equation)}$$
$$E\big[(s(n) - \hat{s}(n))^2\big] = \sigma_s^2 - \sum_q h(q)\, R_{ys}(q)$$
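For a finite filter, the Wiener-Hopf equation becomes a linear system with a Toeplitz matrix. The sketch below assumes a causal FIR filter of length Q, real-valued signals, and correlation values R_yy(k) and R_ys(k) = E[s(n) y(n-k)] that are already known for k = 0..Q-1; the function names are hypothetical and the solver is plain Gaussian elimination rather than a specialized Toeplitz method.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        acc = m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = acc / m[i][i]
    return x


def wiener_fir(r_yy, r_ys):
    """Return h(0..Q-1) solving R_ys(k) = sum_q h(q) R_yy(k - q)."""
    q_len = len(r_ys)
    # R_yy is symmetric for real signals, so R_yy(k - q) = R_yy(|k - q|).
    toeplitz = [[r_yy[abs(k - q)] for q in range(q_len)] for k in range(q_len)]
    return solve_linear(toeplitz, r_ys)


def minimum_mse(sigma_s2, h, r_ys):
    """Evaluate sigma_s^2 - sum_q h(q) R_ys(q)."""
    return sigma_s2 - sum(h_q * r for h_q, r in zip(h, r_ys))


# Example: white speech (variance 1) in uncorrelated white noise (variance 1).
r_yy = [2.0, 0.0, 0.0]
r_ys = [1.0, 0.0, 0.0]
h = wiener_fir(r_yy, r_ys)
residual = minimum_mse(1.0, h, r_ys)
```

In this white-signal example the filter reduces to a single tap that scales the input by the ratio of speech power to total power.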
Figure legend. Blue: original; black: Wiener filter; green: square-root Wiener filter.
10 dB noisy sample:
Iterative WF:
Iterative WF with smoothing:
3/22/2006 Lecture notes for Speech
Communications
Further enhancement to IWF
- Constrained IWF [Hansen, Clements 1987]
  Apply spectral constraints inter-frame and intra-frame using the LSP transformation.
- Pole-zero modeling [Flanagan 1972]
- Replace the WF with Kalman filtering [Gibson 1991]
- Vector quantization method [Gibson 1988]
- Use an HMM [Ephraim 1988]
Assumptions:
- Stationary additive Gaussian noise with known PSD.
- An estimate of the speech spectrum is available.
- Spectral components (DFT coefficients) are statistically independent, and each follows a Gaussian distribution (the DFT amplitude follows a Rayleigh distribution).
- The DFT phase follows a uniform distribution and is independent of the amplitude.
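$$p(Y_k \mid A_k, \alpha_k) = \frac{1}{\pi \lambda_d(k)} \exp\left\{ -\frac{1}{\lambda_d(k)} \left| Y_k - A_k e^{j\alpha_k} \right|^2 \right\},$$
$$p(A_k, \alpha_k) = \frac{A_k}{\pi \lambda_x(k)} \exp\left\{ -\frac{A_k^2}{\lambda_x(k)} \right\}$$
$$\xi_k = \frac{\lambda_x(k)}{\lambda_d(k)}, \qquad \gamma_k = \frac{R_k^2}{\lambda_d(k)}$$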
The MMSE STSA. Solid line: power subtraction; dashed line: Wiener filter. Rpost denotes the a priori SNR estimated without smoothing (the instantaneous SNR).
Comments on the MMSE STSA estimator
The gain curve transitions smoothly between the power subtraction curve and the Wiener curve. The transition is controlled by the unsmoothed estimate of the a priori SNR (Rpost). The larger Rpost, the stronger the attenuation.
This counter-intuitive behavior flattens the spurious spectral peaks caused by noise in the low-SNR part of the spectrum, whereas the WF tends to sharpen the spurious peaks there.
The phase of the noisy speech is used as the phase of the enhanced speech, because the phase is assumed to be uniformly distributed. An independent MMSE estimate of the phasor has non-unity modulus and thus cannot be combined with the MMSE STSA estimate.
The estimator suffers less from musical noise than the WF.
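The two reference curves named in these comments can be written as gains that depend only on the instantaneous SNR. The sketch below assumes the instantaneous SNR R is a linear (not dB) ratio obtained as the noisy-to-noise power ratio minus one and clipped at zero; under that assumption power subtraction gives a gain of sqrt(R / (1 + R)) and the Wiener filter a gain of R / (1 + R). The function names are hypothetical.

```python
import math


def power_subtraction_gain(r_inst):
    """Gain of power spectral subtraction for instantaneous SNR r_inst >= 0."""
    r = max(r_inst, 0.0)  # clip negative SNR estimates (assumption)
    return math.sqrt(r / (1.0 + r))


def wiener_gain(r_inst):
    """Wiener gain when the instantaneous SNR is used as the a priori SNR."""
    r = max(r_inst, 0.0)
    return r / (1.0 + r)


# Tabulate both curves from -10 dB to +20 dB in 5 dB steps.
curve = []
for snr_db in range(-10, 25, 5):
    r = 10.0 ** (snr_db / 10.0)
    curve.append((snr_db, power_subtraction_gain(r), wiener_gain(r)))
```

Because the Wiener gain is the square of the power subtraction gain under this assumption, it attenuates more at every SNR, and the gap is largest at low SNR.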
where $v_k = \frac{\xi_k}{1 + \xi_k}\,\gamma_k$, and $\xi_k$ and $\gamma_k$ are the a priori SNR and the a posteriori SNR, respectively.
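As an illustration of how the a priori and a posteriori SNR enter the estimator, the sketch below evaluates the MMSE STSA gain in the form published by Ephraim and Malah, G = (sqrt(pi)/2) (sqrt(v)/gamma) exp(-v/2) [(1 + v) I0(v/2) + v I1(v/2)] with v = xi gamma / (1 + xi). That this is the exact expression used in the lecture is an assumption. The modified Bessel functions are computed with a truncated power series to keep the example dependency-free, which limits it to moderate values of v; the function names are hypothetical.

```python
import math


def bessel_i0(x, terms=200):
    """Modified Bessel function I0 via its power series (moderate x only)."""
    q = x * x / 4.0
    total, term = 0.0, 1.0  # term_k = q^k / (k!)^2
    for k in range(terms):
        total += term
        term *= q / ((k + 1) ** 2)
    return total


def bessel_i1(x, terms=200):
    """Modified Bessel function I1 via its power series (moderate x only)."""
    q = x * x / 4.0
    total, term = 0.0, 1.0  # term_k = q^k / (k! (k+1)!)
    for k in range(terms):
        total += term
        term *= q / ((k + 1) * (k + 2))
    return (x / 2.0) * total


def mmse_stsa_gain(xi, gamma):
    """MMSE STSA gain for a priori SNR xi and a posteriori SNR gamma (linear)."""
    v = xi / (1.0 + xi) * gamma
    bracket = (1.0 + v) * bessel_i0(v / 2.0) + v * bessel_i1(v / 2.0)
    return (math.sqrt(math.pi) / 2.0) * (math.sqrt(v) / gamma) * math.exp(-v / 2.0) * bracket


high_snr_gain = mmse_stsa_gain(100.0, 101.0)
low_snr_gain = mmse_stsa_gain(1.0, 1.0)
```

At high SNR the gain approaches the Wiener gain xi / (1 + xi), while at low SNR it stays above it.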
Decision-Directed
Wiener Filter: MMSE LSA:
Noisy sample (0 dB):
MMSE estimator with non-Gaussian prior
How well does Gaussian model fit the real probability distribution of DFT
coefficients?
$$\hat{S} = E(S \mid y) = C_s F^H \left( F C_s F^H + C_v \right)^{-1} y$$
$$r = H s$$
The covariance matrix of r can be written as a diagonal matrix with
the square of r as its diagonal elements. Then the covariance matrix
of s and S can be written respectively as:
$$C_s = H^{-1} C_r H^{-H}$$
$$C_S = F C_s F^H$$
The TFE-MMSE estimator preserves the signal spectrum better than the Wiener filter.
Results
TFE-MMSE estimator
TFE-Kalman filtering
Compared to
WF
Noisy (10 dB)