AEC AutoTuning

Dynamic AEC Tuning

Senthil Kumar Mani
{senthil.mani}@imgtec.com
Abstract: The performance of an acoustic echo cancellation (AEC)
system depends heavily on the platform: the characteristics of the
microphone, speaker and audio path, and the related hardware and
software. Achieving full duplex voice communication with sufficient
echo cancellation on a wide variety of platforms is therefore a
great challenge, and the majority of echo cancellers require
platform-specific tuning to provide optimal performance. This paper
proposes an algorithm that tunes AEC parameters dynamically, which
removes the need for manual tuning of echo cancellers. The proposed
algorithm uses an echo return loss (ERL) estimate and a convergence
stability estimate. The ERL estimate is used to tune the far-end and
microphone signal attenuations, and the convergence stability
estimate is used to tune the threshold of the non-linear processor
(NLP).
Depending on platform characteristics, the convergence stability
detection (CSD) algorithm achieves three distinct voice
communication modes: full duplex, marginal duplex and half duplex.
The CSD module dynamically tunes the NLP threshold based on the
variation over time of the echo path model to provide echo-free
output, while the attenuation tuning prevents the filter
coefficients from saturating, underflowing or overflowing.

I. INTRODUCTION
The performance of Acoustic Echo Cancellation (AEC) depends heavily
on the platform, more specifically on the audio interface, the
interface driver and related hardware, any pre- and post-amplifiers,
and the characteristics of the microphone and speaker. Achieving
full duplex voice communication with sufficient echo cancellation on
a wide variety of platforms is therefore a research challenge, and
the majority of echo cancellers require platform-specific tuning to
provide optimal performance. The major components of an AEC are the
Adaptive Echo Estimation Filter (AEEF) and the corresponding
adaptation algorithm. The AEEF is usually implemented in an integer
format, mostly 16-bit and in specific cases 32-bit, because a
fractional-format implementation has high resource requirements.
This restricts AEC performance to a limited range of ERL. Under high
negative ERL, the filter coefficients may saturate, underflow or
overflow. One simple way to address this problem is to attenuate the
microphone signal used for error estimation. Because this error
drives the adaptation of the AEEF, the filter coefficients then do
not saturate, overflow or underflow. The error signal is re-scaled
to obtain the actual error before it is processed in the subsequent
blocks. Similarly, when the ERL is very high, say above 30 dB, the
integer representation of the coefficients is not sufficient to
estimate the echo. In this case, the far-end signal fed to the AEC
as reference is attenuated sufficiently for the echo to be
estimated. In a varying echo path environment and across diverse
platforms, dynamic estimation of the microphone and far-end signal
attenuations is required for a general-purpose AEC to provide
optimal performance without any platform-specific manual tuning.

Echo path modeling by the AEEF may also be sub-optimal due to
platform non-linearity, high background noise capture, etc. In such
scenarios, optional blocks such as the Residual Echo Suppressor and
the Non-Linear Processing (NLP) Controller may not perform as
expected. This leads to significant residual echo leakage, which
severely impacts the communication. Since double-talk overlap is
rare in normal communication, single-talk echo should be suppressed
using the NLP with minimal breaks or distortion during double talk.
To avoid manual tuning of the platform for this case, a dynamic NLP
threshold controller is needed, in addition to microphone and
far-end signal attenuation, for a general-purpose AEC that works on
the majority of platforms.

The paper is organized as follows. Section II discusses the
high-level design of the automatic AEC tuning algorithm. Section III
discusses the ERL estimator and gain tuning modules. Section IV
discusses the convergence stability detection algorithm. Section V
concludes with a summary.
Version 1.0
II. AUTOMATIC AEC TUNING ALGORITHM

The high-level block diagram of the AEC auto-tuning algorithm is
illustrated in Figure 1. The auto-tuning estimates the optimal
microphone attenuation, far-end attenuation and NLP threshold. For
reasonable accuracy in ERL estimation, the measurement is done only
during single-talk regions. Hence, the ERL estimator's functionality
depends strongly on how well single-talk regions are discriminated
in the microphone output. Currently, the discrimination uses the
time-domain power level difference between the microphone and error
signals: the ERL measurement is invoked whenever the power level of
the error is at least a predefined threshold lower than the power
level of the microphone signal. The microphone signal is attenuated
in the high negative ERL case, and the far-end signal is attenuated
in the high positive ERL case.

The convergence stability detector monitors the coefficients of the
filter, specifically in the dominant region, and decides the
convergence stability of the AEEF from the variation of the filter
coefficients over time. If the AEEF has stable convergence, the NLP
controller threshold is set to a lower value to provide full duplex
performance. If the AEEF has marginal convergence, or less than the
expected minimal convergence, the NLP threshold is set to a
correspondingly higher value to arrest residual echo leakage from
the AEC. In this case there may be voice breaks, depending on the
instability of the AEEF convergence. In addition to tuning the NLP
threshold, the convergence stability detector helps confirm the
microphone attenuation tuned by the ERL estimator.

Imagination Technologies Ltd. Confidential
Figure 1: High level block diagram of AEC Auto Tuning
[Block diagram. Blocks: ERL Estimator; Gain Tuning; Attenuator, M;
Amplifier, M; Attenuator, N; Adaptive Echo Estimation Filters h(n);
Echo & Noise Suppressor; NLP & CNG; Convergence Stability Detector;
NLP Threshold Tuning. Signals: far-end signal x(n), microphone
signal d(n), echo-free signal e(n), echo estimate signal y(n).]

Notations used: x(n), d(n), e(n) and y(n) are respectively the
far-end, microphone, echo-free and echo estimate data at discrete
time instant n; h(n) is the impulse response of the echo path; M and
N are the gain/attenuation factors applied to the microphone and
far-end data respectively.

More details about ERL estimation, convergence stability detection
and the corresponding tuning algorithms are discussed in subsequent
sections.

III. ERL ESTIMATOR

ERL is the amount of echo loss in the acoustic echo path, measured
in decibels (dB). Generally, for better estimation accuracy, the ERL
measurement is done only during single-talk (ST) regions, as the
echo canceller cannot always model the echo path as expected. Thus,
the ERL estimator's functionality depends strongly on how well
single-talk regions are discriminated in the microphone output. Any
spurious detection of a near-end region as a single-talk region
leads to a wrong ERL estimate. Delayed single-talk detection, or
detecting single talk as near end or double talk (DT), delays the
estimation but does not affect its accuracy. The first version of
the AEC tuning wizard employs simple logic for single-talk
discrimination, so the measurement may have a higher tolerance. A
single-talk region is detected by comparing the power level of the
microphone signal with that of the echo-free signal (generally
called the error signal). This ST detection algorithm relies on AEEF
cancellation of at least 6 dB. Hence, the ERL measurement is invoked
whenever the power level of the echo-free signal is at least the
predefined threshold of 6 dB lower than the power level of the
microphone signal. Figure 2 illustrates the low-level block diagram
of ERL estimation.

[Flowchart blocks of Figure 2: Estimate frame energy of x(n), d(n)
and e(n); Estimate long-term average of Px(l); Estimate average
energy of past W frames; Estimate long-term average of PdW(l) and
PxW(l); ST Discrimination (Single Talk?); Power Loss Estimation; ERL
Confirmation; Attenuate Mic. signal; Process Next Frame; outcomes
Full Duplex and Half Duplex.]
Figure 2: Low level block diagram of ERL estimation

The energy in the microphone signal, far-end signal and echo-free
signal is estimated for every frame of 10 ms. The short-term average
power of each of these signals is estimated over the past W frames.
Echo is observed in the microphone signal only when the far end is
active. Far-end signal presence is detected by comparing the
long-term average of the far-end power with a predefined threshold.
The AEEF adapts when the far end is active and the near end is
inactive. However, overlap of far end and near end may be rare.
Hence, the number of samples for which the far end is active and the
number of samples for which the AEEF coefficients are adapted (the
adaptation counter) are monitored. If the adaptation counter does
not reach a predefined value of active far-end samples, the
microphone signal is attenuated further, so that the AEEF
cancellation is at least 6 dB even under low ERL conditions.

The ERL estimate is computed only during single-talk regions. The
single-talk discrimination block detects single-talk regions by
comparing the short-term error power with the short-term microphone
power. It is assumed that, in single-talk regions, the short-term
error power is at least 6 dB lower than the short-term power of the
microphone signal. However, this 6 dB attenuation may not be
achievable at high ERL. Under high ERL conditions, single-talk
regions are detected by comparing the short-term microphone power
with the long-term far-end power. During single-talk regions, the
ERL is estimated by computing the ratio of the long-term microphone
power to the long-term far-end power.

Full duplex operation of the AEC is enabled when the single-talk
discriminator does not find any single-talk regions. During the ERL
estimation process, if the far end is active and AEEF adaptation is
not observed, the AEC works in half duplex mode. The AEC works in
full duplex mode once the estimated ERL is stable. ERL estimation
stops once the confirmation block confirms the stability of the ERL
estimate for a certain duration.
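The single-talk test and the ratio-based ERL estimate described above can be sketched as follows. This is a minimal illustration, not the implementation used in the paper: the function names and the one-pole smoothing helper are assumptions introduced here, and only the 10 ms frame, the 6 dB margin and the mic-to-far-end power ratio come from the text.

```python
import math

ST_MARGIN_DB = 6.0  # cancellation margin the single-talk test relies on (from the text)

def frame_energy(frame):
    """Energy of one frame: sum of squared samples."""
    return sum(s * s for s in frame)

def to_db(power_ratio):
    """Express a power ratio in decibels."""
    return 10.0 * math.log10(power_ratio)

def is_single_talk(p_mic_short, p_err_short, margin_db=ST_MARGIN_DB):
    """True when the error power is at least margin_db below the mic power."""
    if p_mic_short <= 0.0 or p_err_short <= 0.0:
        return False  # no usable signal in this window
    return to_db(p_mic_short / p_err_short) >= margin_db

def smooth(previous, new, factor):
    """One-pole long-term average (hypothetical helper): factor*previous + (1-factor)*new."""
    return factor * previous + (1.0 - factor) * new

def erl_db(p_mic_long, p_far_long):
    """ERL as the text defines it: long-term mic power over long-term far-end power, in dB."""
    return to_db(p_mic_long / p_far_long)
```

For example, `is_single_talk(4.0, 1.0)` holds because the error power is about 6.02 dB below the microphone power, whereas `is_single_talk(3.0, 1.0)` does not.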
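The following steps are involved in ERL estimation:

1. Compute the long-term power average of the far-end signal and
   compare it with a predefined threshold.

   $$P_x(l) = \sum_{i=0}^{N-1} x^2(i)$$

   $$P_x^{LT}(l) = \gamma\, P_x^{LT}(l-1) + (1-\gamma)\, P_x(l)$$

   where N is the frame size of 10 ms and the smoothing factor is
   written here as $\gamma$.

   a. On success, this indicates single-talk regions and ERL
      estimation proceeds as described in the steps below.
   b. On failure, the AEC works in full duplex mode.

2. Compute the short-term power of the microphone and echo-free
   signals and compare their ratio with a predefined threshold.

   $$P_d(l) = \sum_{i=0}^{N-1} d^2(i), \qquad P_e(l) = \sum_{i=0}^{N-1} e^2(i)$$

   $$P_d^{W}(l) = \sum_{i=0}^{W-1} P_d(l-i), \qquad P_e^{W}(l) = \sum_{i=0}^{W-1} P_e(l-i)$$

   where W is a window of 32 frames.

   a. On success, compute the long-term power average of the
      microphone signal and the far-end signal.

      $$P_d^{L}(l) = \begin{cases} \beta_1 P_d^{L}(l-1) + (1-\beta_1)\, P_d^{W}(l) & \text{if } P_d^{W}(l) \lessgtr P_d^{L}(l-1) \\ \beta_2 P_d^{L}(l-1) + (1-\beta_2)\, P_d^{S}(l) & \text{otherwise} \end{cases}$$

      $$P_x^{L}(l) = \begin{cases} \beta_1 P_x^{L}(l-1) + (1-\beta_1)\, P_x^{W}(l) & \text{if } P_x^{W}(l) \lessgtr P_x^{L}(l-1) \\ \beta_2 P_x^{L}(l-1) + (1-\beta_2)\, P_x^{S}(l) & \text{otherwise} \end{cases}$$

      The two smoothing factors are written here as $\beta_1$ and
      $\beta_2$; the comparison operator in the branch condition is
      not legible in the source and is shown as $\lessgtr$.

   b. On failure, the AEC works in full duplex mode if its previous
      mode was full duplex; otherwise the AEC works in half duplex
      mode.

3. Compute the ratio of the long-term power average of the
   microphone signal to the long-term power average of the far-end
   signal.

   $$ERL = 10 \log_{10} \frac{P_d^{L}(l)}{P_x^{L}(l)}$$

Table 1 tabulates ERL values in logarithmic scale against the
corresponding linear scale.

Table 1. ERL values in logarithmic and linear scale

   ERL, logarithmic scale (dB)    ERL, linear scale
   -18                            0.0158
   -12                            0.0631
   -6                             0.2512
   -3                             0.5012
   0                              1
   3                              1.9953
   6                              3.9811
   12                             15.8489
   18                             63.0957
   24                             251.1886

A. Microphone and Far-end Attenuation Tuning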
As the AEEF coefficients are generally represented in integer
format, negative ERL leads to saturation. The microphone and far-end
gains should therefore be adjusted so that the filter coefficients
do not saturate. The default values of microphone attenuation and
far-end attenuation for the different audio modes are given in
Table 2. These default values ensure that the AEEF can estimate the
echo even under high negative ERL conditions and help the ERL to be
estimated faster. Gain tuning optimizes these values depending on
the ERL measure, as given in Table 3, Table 4 and Table 5 for
speaker, handset and headset audio modes respectively. The main
objective of gain tuning is to provide the maximum possible full
duplex AEC operation without noticeable echo; it also helps the AEC
to provide optimum performance on different platforms without manual
tuning of parameters. The following sections describe the automatic
gain tuning for the different audio modes.
Table 2. Default tunable values for microphone and far-end
attenuation

   Audio mode    Microphone attenuation    Far-end attenuation
   Speaker       4                         1
   Handset       0                         2
   Headset       0                         5

Table 3. Tunable values in Speaker audio mode

   Estimated ERL    Microphone attenuation    Far-end attenuation
   -6 dB            3                         1
   -12 dB           4                         1
   0 dB             2                         1

Table 4. Tunable values in Handset audio mode

   Estimated ERL    Microphone attenuation    Far-end attenuation
   6 dB             0                         0
   12 dB            0                         2
   24 dB            0                         5

Table 5. Tunable values in Headset audio mode

   Estimated ERL    Microphone attenuation    Far-end attenuation
   12 dB            0                         3
   24 dB            0                         5
   30 dB            0                         6

i. Speaker mode tuning
In general, the ERL in speaker audio mode may be lower than 0 dB,
and some platforms have an ERL as low as -24 dB. The default gain
values are set to handle very low ERL cases, so that the filter can
work over a wide range of ERL, down to -24 dB. However, high
attenuation of the microphone signal affects the expected
double-talk performance if the ERL is higher than the value
corresponding to the default tunable parameters. Once the ERL
estimator confirms the ERL, the optimal gains are set and the
optimum performance of the AEC can be achieved. Figure 3 illustrates
an actual ERL of -12 dB. To achieve positive ERL, the microphone
signal is attenuated by 18 dB, i.e. the microphone attenuation value
is 3 (6 dB per attenuation step), so that the effective ERL is 6 dB.
Figure 4 illustrates ERL tracking.

Figure 3: (a) Microphone signal (b) far-end signal (c) Power levels
(d) Actual ERL (-12 dB) (e) ERL achieved after attenuating the
microphone signal (6 dB)

Figure 4: Estimated ERL tracking actual ERL

ii. Handset mode tuning
In general, the ERL in handset audio mode is more than 12 dB, i.e.,
the echo level at the microphone is more than 12 dB below the
far-end signal level at the handset speaker (about a factor of four
in amplitude). The default tunable values support an ERL of 12 dB.
Platforms with high positive ERL require far-end attenuation for
complete cancellation of the echo by the AEEF. Similarly, platforms
with an ERL of less than 12 dB require microphone attenuation. The
microphone signal is also attenuated when the AEEF does not adapt
during far-end active regions.

iii. Headset mode tuning
In general, the ERL in headset audio mode is more than 24 dB, i.e.,
the echo level at the microphone is more than 24 dB below the
far-end signal level at the headset speaker (about a factor of
sixteen in amplitude). The default tunable values support an ERL of
24 dB. The microphone signal is also attenuated when the AEEF does
not adapt during far-end active regions.
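The table-driven gain tuning for the three audio modes can be sketched as a lookup. The values are those of Tables 2 to 5; the nearest-row selection, the function names and the 6 dB-per-step constant (inferred from the speaker-mode example, where 18 dB corresponds to an attenuation value of 3) are assumptions made for illustration only.

```python
DB_PER_STEP = 6.0  # assumption: 18 dB of attenuation corresponds to a value of 3

# (microphone attenuation, far-end attenuation) per audio mode, from Table 2
DEFAULTS = {"speaker": (4, 1), "handset": (0, 2), "headset": (0, 5)}

# estimated ERL in dB -> (microphone attenuation, far-end attenuation), Tables 3-5
TUNED = {
    "speaker": {-6: (3, 1), -12: (4, 1), 0: (2, 1)},
    "handset": {6: (0, 0), 12: (0, 2), 24: (0, 5)},
    "headset": {12: (0, 3), 24: (0, 5), 30: (0, 6)},
}

def tune_gains(mode, estimated_erl_db=None):
    """Return (mic, far-end) attenuation; defaults apply until an ERL estimate exists."""
    if estimated_erl_db is None:
        return DEFAULTS[mode]
    table = TUNED[mode]
    # Assumption: pick the tabulated ERL row closest to the estimate.
    nearest = min(table, key=lambda erl: abs(erl - estimated_erl_db))
    return table[nearest]

def effective_erl_db(actual_erl_db, mic_attenuation):
    """ERL seen by the filter after attenuating the microphone signal."""
    return actual_erl_db + DB_PER_STEP * mic_attenuation
```

With these definitions, `effective_erl_db(-12, 3)` reproduces the speaker-mode example: an actual ERL of -12 dB becomes an effective ERL of 6 dB.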
In addition to gain tuning with respect to the ERL estimate, if the
convergence stability detector indicates any saturation in the AEEF
coefficients, gain tuning is done accordingly to avoid saturation in
the AEEF coefficients used for echo cancellation.

[Flowchart blocks of Figure 5: AEEF adpt_cntr; Timer == 20 ms; ST
Alone vs. DT/Noise/Near End Alone; Parse Echo Path Filter; Dominant
Effective Echo Region; Modeled Echo Path Variation Parameter; NLP
Threshold Tuning; NLP Threshold Confirmation; Attenuation Update;
ERL Estimator Module; Frozen?; Process Next Frame.]
IV. CONVERGENCE STABILITY DETECTION

The convergence stability detection (CSD) module detects the
convergence of the AEEF impulse response and tunes the NLP threshold
accordingly to provide echo-free output. The CSD decision is based
on the variation over time of the echo path estimate. For example,
when the far end is active and the echo path is not modeled at all,
or is modeled improperly, the system is tuned for half duplex. When
the modeled echo path provides a cancellation of 6 dB, the NLP
threshold is tuned to give the maximum possible duplex operation,
depending on AEEF convergence. When the echo path modeling gives a
cancellation of more than 20 dB, the NLP threshold is tuned to give
full duplex communication.

Figure 5: Low level architecture of Convergence Stability Detection

Figure 5 illustrates the low-level architecture of the Convergence
Stability Detection module. The impulse response of the AEEF filter
of tail length N is split into K sub-parts, each of length 4 ms with
an overlap of 1 ms. Every 2 ms, the Effective Echo Region (EER) is
estimated by the parse echo path filter block. The EER defines the
predominant energy regions in the modeled echo path. To identify the
EER, the energy of the impulse response in every 3 consecutive
sub-parts is calculated every 2 ms.

Every 20 ms, the estimated energies in the EERs are sorted in
descending order and the top 3 are updated as the dominant effective
echo regions (DEER). Depending on the energy level in the DEER, 3
regions are defined to trigger the ERL estimator, as shown in
Table 6. In Region 1, the energy level in the DEER is marginal; in
this case the microphone attenuation should be maintained. In Region
2, the energy level in the DEER is very low; in this case the
microphone attenuation should be increased. In Region 3, the energy
level in the DEER is saturating; in this case the microphone
attenuation should be reduced.
Table 6: Microphone attenuation based on DEER energy

   Energy in DEER    Microphone attenuation
   Very low          Increment
   Marginal          Retain
   Saturation        Decrement
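The sub-part energy ranking and the Table 6 decision can be sketched as below. This is a simplified illustration under stated assumptions: it ranks individual 4 ms sub-parts rather than groups of 3 consecutive sub-parts, and the thresholds `low_thr` and `sat_thr` are hypothetical, since the text does not give numeric limits for "very low" or "saturation".

```python
def subpart_energies(h, fs, part_ms=4, overlap_ms=1):
    """Energy of each 4 ms sub-part of impulse response h (1 ms overlap, so a 3 ms hop)."""
    part = part_ms * fs // 1000
    hop = (part_ms - overlap_ms) * fs // 1000
    return [sum(c * c for c in h[start:start + part])
            for start in range(0, len(h) - part + 1, hop)]

def dominant_regions(energies, top=3):
    """Indices of the `top` highest-energy regions, strongest first."""
    order = sorted(range(len(energies)), key=lambda i: energies[i], reverse=True)
    return order[:top]

def mic_attenuation_action(deer_energy, low_thr, sat_thr):
    """Table 6 mapping; low_thr and sat_thr are hypothetical thresholds."""
    if deer_energy < low_thr:
        return "increment"   # very low energy in the DEER
    if deer_energy >= sat_thr:
        return "decrement"   # energy in the DEER is saturating
    return "retain"          # marginal energy
```

At a sampling rate of 1 kHz, for instance, each sub-part spans 4 coefficients and consecutive sub-parts start 3 coefficients apart.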

Based on the AEEF adaptation counter (AEEF_adpt_cntr), the impulse
response under consideration is classified into 3 regions as given
in Table 7. If AEEF_adpt_cntr is high, the region is a single-talk
echo region. If the counter is marginal, the region is a double-talk
region. If the counter is low, the region is a near-end/noise
region. During a single-talk region, the Modeled Echo Path Variation
parameter (MEPV) is estimated in the dominant regions. The MEPV
parameter describes the variations in the modeled echo path. In the
normal scenario, where the AEEF adaptation is proper, the MEPV value
is low. When the AEEF adaptation is incorrect due to an improper ERL
estimate or non-linearity in the platform, the MEPV value is high.
Based on the MEPV parameter, the NLP threshold is tuned to provide
echo-free output. Figure 6 illustrates the NLP threshold for a given
impulse response.
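The region classification and the MEPV computation can be sketched as follows, using the counter values of Table 7 and the standard-deviation definition given in the detailed steps of this section. The boundaries between the tabulated counter values, the exact form of the deviation (a root of summed squared differences) and all function names are assumptions for illustration.

```python
import math

def classify_region(adapt_ms):
    """Acoustic region from the AEEF adaptation counter (ms adapted per 20 ms frame)."""
    if adapt_ms < 2:
        return "noise/near end"
    if adapt_ms < 20:            # assumption: anything between the tabulated values
        return "double talk"
    return "single talk echo"

def short_term_mepv(weights, weights_avg):
    """Deviation of the DEER weights from their long-term average."""
    return math.sqrt(sum((w - a) ** 2 for w, a in zip(weights, weights_avg)))

def normalized_mepv(mepv, deer_energies):
    """Normalize by the peak DEER energy to remove platform-dependent scale."""
    peak = max(deer_energies)
    return mepv / peak if peak > 0 else 0.0

def long_term_mepv(previous_lt, short_term, factor):
    """Long-term average that smooths out sudden short-term variations."""
    return factor * previous_lt + (1.0 - factor) * short_term
```

A low long-term value corresponds to stable adaptation, and a high value to an unstable echo path model.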
Table 7: Classification of regions based on AEEF adaptation counter

   AEEF adaptation counter    Acoustic region
   < 2 ms                     Noise/Near end
   10 ms                      Double talk
   20 ms                      Single Talk Echo

Figure 6: CSD performance for iPhone5 test data.
[Plot panels: Impulse Response; Tuning Parameter for NLP Threshold
(Expected NLP Factor, Estimated NLP Factor); x-axis: Samples.]

Details of the convergence stability detection are given below.

1. The impulse response of the AEEF filter of tail length N is split
   into K sub-parts, each of length 4 ms and with an overlap of
   1 ms.

2. For every 2 ms of microphone signal processed, the energy in each
   block of K sub-parts is estimated and stored:

   $$E(i) = \sum_{n=p}^{p+4K} h^2(n), \qquad 1 \le i \le 10,$$

   $$p = 1,\ 4K,\ \ldots,\ 4kK; \qquad k = 1, 2, \ldots, \frac{N - 4 f_s}{3 f_s};$$

   for the present discussion K = 3.

3. Step 2 is repeated for 20 ms. At the end of 20 ms, the effective
   echo regions are estimated as the top 3 regions of the impulse
   response energy buffer sorted in descending order.

4. The AEEF adaptation counter indicates the number of samples for
   which the AEEF is updated. Table 7 tabulates the counter value
   and the related region for a time frame of 20 ms.

5. During Single Talk Echo, the dominant effective echo region
   within the effective echo regions is identified by comparing
   $\tilde{e}_d(l)$, the ratio of dominant energy to total energy in
   the effective echo region, with a predefined threshold,
   $th = 0.2$:

   $$E_T(l) = \sum_{i=1}^{10} E(i)$$

   $$E_d(l, p) = \max_{i = 1 \ldots 10} E(i), \qquad 1 \le p \le 3$$

   $$\tilde{e}_d(l) = \frac{E_d(l)}{E_T(l)}$$

6. For every time frame of 20 ms, the short-term modeled echo path
   variation parameter (MEPV), written here as $\sigma_s(l)$, is
   estimated as the standard deviation of the weights in the DEER:

   $$\sigma_s(l) = \sqrt{\sum_{i=1}^{3} \left(\tilde{W}(i) - W_{avg}(i)\right)^2}$$

7. The estimated short-term MEPV is normalized with the peak energy
   of the effective echo region to normalize platform variations and
   maintain a stable dynamic range:

   $$\sigma_s(l) \leftarrow \frac{\sigma_s(l)}{\max_p E_d(l, p)}$$

8. The long-term average, written here as $\sigma_{lt}(l)$, is
   calculated to remove short-term discontinuities and sudden
   variations:

   $$\sigma_{lt}(l) \leftarrow \lambda\, \sigma_{lt}(l) + (1 - \lambda)\, \sigma_s(l)$$

   where $\lambda$ is the long-term averaging factor.
9. Using the NLP threshold tuning parameter, written here as
   $\beta_{nlp}$, the long-term average of the weights is updated:

   $$W_{avg}(n) = \beta_{nlp}\, W_{avg}(n) + (1 - \beta_{nlp})\, W(n)$$

10. Based on the values of the long-term MEPV, $\sigma_{lt}(l)$, the
    NLP threshold tuning controller parameter, written here as
    $\beta$, is updated. With thresholds $\tau_1 > \tau_2$:
    a. Region A; $\sigma_{lt}(l) \ge \tau_1$: half duplex mode.
    b. Region B; $\tau_1 > \sigma_{lt}(l) \ge \tau_2$: partial duplex
       mode.
    c. Region C; $\tau_2 > \sigma_{lt}(l) \ge 0$: full duplex mode.
    d. Also, when the far end is active but the impulse response is
       not modeled at all, the MEPV estimate is very low. This leads
       to echo leakage because of the low $\beta$ value; in such
       cases $\beta$ is set to its maximum to achieve half duplex
       communication.

11. The decision logic is based on the observation of the factor
    $\beta$ over the past 20 time frames, i.e., 400 ms.
    a. Three counters, for regions A, B and C, are used to trigger
       the update of the NLP threshold tuning parameter
       $\beta_{nlp}$.
    b. Say $\beta$ is in region X: a counter is started and
       incremented as long as $\beta$ stays in the same region X,
       where X can be any one of the regions A, B and C.
       i. The counter is checked for the maximum over the past 20
          time frames; if it is the maximum, $\beta_{nlp}$ is
          updated as

          $$\beta_{nlp} = \alpha_x\, \beta + (1 - \alpha_x)\, \beta_{nlp}$$

          where $\alpha_x$ is a region-specific (A, B or C) smoothing
          parameter that provides a smooth update of the NLP
          threshold tuning parameter: $\alpha_A = 0.95004$;
          $\alpha_B = 0.8999$; $\alpha_C = 0.8001$. The symbol of the
          first term is not legible in the source and is read here
          as the controller parameter $\beta$.
          In the event of a pure delay variation, or on a non-linear
          platform, echo leakage occurs before the AEEF adapts. In
          such cases, say if a transition is occurring from region C
          to region A, i.e., from full duplex to half duplex, the
          NLP threshold parameter update should be fast enough to
          avoid echo leakage. For such cases, the region-specific
          smoothing parameter is updated to
          $\alpha_{C \to A} = 0.049989$.
       ii. If any double talk or near-end data appears while $\beta$
           is in the same region, the counter is reset.
       iii. If $\beta$ changes frequently to other regions, the
            counters of those regions are incremented. At the end of
            the 400 ms time frame, $\beta_{nlp}$ is updated with the
            value corresponding to the region with the maximum
            counter value.
       iv. For case iii above, the counters of the other regions are
           reset after 400 ms.

12. The NLP threshold confirmation module verifies whether the NLP
    threshold parameter is frozen.
    a. If frozen, the NLP threshold tuning parameter value is frozen
       and the algorithm is stopped.
    b. If not frozen, the NLP threshold tuning parameter is updated
       and the algorithm is repeated.

V. CONCLUSION
An automatic AEC tuning algorithm is proposed. The proposed
algorithm estimates the ERL and tunes the attenuation factors of the
microphone and far-end signals for higher echo cancellation. It also
estimates the convergence stability of the modeled echo path to
update the NLP threshold parameter and provide echo-free output.
Based on platform characteristics and the modeled echo path, the
tuned NLP threshold parameter triggers full duplex, marginal duplex
or half duplex voice communication.
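As an illustration of the region decision and the 400 ms voting logic in steps 10 and 11 of Section IV, the following sketch classifies the long-term MEPV into regions A, B and C and applies a smoothed update. The threshold arguments, the `target` value and the function names are hypothetical, and the update form is a reconstruction rather than a confirmed formula; only the smoothing values 0.95004, 0.8999 and 0.8001 are taken from the text.

```python
from collections import Counter

# Region-specific smoothing values quoted in step 11
ALPHA = {"A": 0.95004, "B": 0.8999, "C": 0.8001}

def region_of(mepv_lt, thr_high, thr_low):
    """Region from the long-term MEPV; thr_high > thr_low are hypothetical thresholds."""
    if mepv_lt >= thr_high:
        return "A"   # half duplex
    if mepv_lt >= thr_low:
        return "B"   # partial duplex
    return "C"       # full duplex

def winning_region(regions):
    """Region observed most often over a window (20 frames of 20 ms = 400 ms)."""
    return Counter(regions).most_common(1)[0][0]

def update_nlp(nlp_previous, target, region):
    """Smoothed update: alpha*target + (1 - alpha)*previous (reconstructed form)."""
    alpha = ALPHA[region]
    return alpha * target + (1.0 - alpha) * nlp_previous
```

A window in which region C is seen in 12 of 20 frames would, under this sketch, select C and keep the system in full duplex.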
