Abstract of Acoustic Echo Cancellation

The rapid advancement of technology in recent years has changed the way people
communicate. Nowadays people are increasingly interested in hands-free
communication. In such a situation, the use of an ordinary loudspeaker and, often,
a high-gain microphone in place of the conventional telephone receiver is well
justified. This allows several people to take part in a conversation at the same
time, as in a teleconferencing scenario. A further benefit is that the user has
both hands free and can move around the room at ease. However, the significant
acoustic coupling between the loudspeaker and the microphone produces a loud echo
that makes conversation difficult. The remedy is to remove the echo with an echo
suppression or echo cancellation algorithm.
However, the echo suppressor has a major disadvantage: it supports only
half-duplex communication, which allows only one speaker to talk at a time. This
drawback motivated the development of echo cancellers. An important property of
echo cancellers is that they sustain full-duplex communication, which permits both
speakers to talk at the same instant. The three basic components of an echo
canceller are an adaptive filter, a doubletalk detector and a nonlinear processor
(NLP). The adaptive filter generates a close replica of the echo and subtracts it
from the microphone signal, which is the combination of the echo and the near-end
user's speech.
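The adaptive filter stage can be illustrated with a minimal sketch. The text does not say which adaptation algorithm is used, so the normalized LMS (NLMS) update below is an assumption, as are the function name `nlms_echo_canceller`, the filter length `num_taps`, the step size `mu`, the regularisation constant `eps` and the three-tap test echo path. Python is used purely for illustration.

```python
import random

def nlms_echo_canceller(far_end, mic, num_taps=8, mu=0.5, eps=1e-8):
    """Return (error_signal, weights) after NLMS adaptation.

    far_end : samples sent to the loudspeaker (reference signal)
    mic     : samples picked up by the microphone (echo + near-end speech)
    """
    w = [0.0] * num_taps      # current estimate of the echo path
    buf = [0.0] * num_taps    # most recent far-end samples, newest first
    err = []
    for x, d in zip(far_end, mic):
        buf = [x] + buf[:-1]
        y = sum(wi * bi for wi, bi in zip(w, buf))   # echo replica
        e = d - y                                    # microphone minus replica
        norm = sum(b * b for b in buf) + eps         # input power normalisation
        w = [wi + mu * e * bi / norm for wi, bi in zip(w, buf)]
        err.append(e)
    return err, w

# Hypothetical test case: a 3-tap echo path and no near-end speech.
random.seed(0)
echo_path = [0.0, 0.6, 0.3]
far = [random.uniform(-1.0, 1.0) for _ in range(2000)]
mic = [sum(h * far[k - i] for i, h in enumerate(echo_path) if k - i >= 0)
       for k in range(len(far))]
residual, weights = nlms_echo_canceller(far, mic)
```

In this noise-free case the weights should settle close to the assumed echo path and the residual should decay towards zero. With near-end speech present, the residual would instead carry that speech.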
The doubletalk detector senses doubletalk, which occurs when both users speak
simultaneously. During doubletalk it halts adaptation of the filter in order to
avoid divergence.
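One simple way to sense doubletalk can be sketched as follows. The text does not name a detection method, so the Geigel-style level comparison used here is an assumption, as are the function name `geigel_doubletalk`, the window length and the threshold of 0.5. The detector declares doubletalk whenever the microphone sample is larger than a fraction of the recent far-end peak.

```python
def geigel_doubletalk(far_end, mic, window=4, threshold=0.5):
    """Return a list of booleans, True where doubletalk is declared."""
    flags = []
    for k, d in enumerate(mic):
        recent = far_end[max(0, k - window + 1):k + 1]   # last `window` far-end samples
        peak = max(abs(x) for x in recent)
        flags.append(abs(d) > threshold * peak)
    return flags

far = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
echo = [0.3 * x for x in far]             # echo attenuated by the room
near = [0.0, 0.0, 0.0, 0.9, 0.9, 0.0]     # near-end talker active at k = 3 and 4
mic = [e + n for e, n in zip(echo, near)]
flags = geigel_doubletalk(far, mic)
# flags == [False, False, False, True, True, False]
```

In a complete canceller the weight update of the adaptive filter would be skipped for every sample where the flag is True, while the echo replica is still subtracted.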
In order to avoid clipping, a noise gate is used as the nonlinear processor. The
noise gate allows a threshold value to be set, and all signals below the threshold
are removed. This ensures that only residual echoes are removed in the last stage.
To date, real-time implementations of acoustic echo cancellation (AEC) have used
both a VLSI processor and a DSP processor. With the advances in computing, all the
essential algorithms are implemented in MATLAB.
Acoustic Echo in Telephony
The advent of hands-free telephony and teleconferencing has enabled users to
communicate without holding the device in their hand during a conversation. In
such cases, however, a number of detrimental phenomena can significantly degrade
the quality of the transmitted speech. Acoustic echo is perhaps the most
troublesome of these. Acoustic echo is produced when the loudspeaker and
microphone of the same device become acoustically coupled (fig. 1.2). When the
hands-free mode of a cell phone is off, the sound wave coming out of the speaker
does not have sufficient power to be sensed by the microphone of the same cell
phone.
In this scenario the power level of the speech wave at the speaker output is so
low that we cannot hear the far-end talker without holding the cell phone with its
speaker near our ear (fig. 1.1). By the time the sound reaches the microphone of
the same cell phone it is practically undetectable. Therefore, if the near-end
talker speaks into the microphone, only his/her voice will be sensed, and the
effect of the far-end talker's speech emitted by the near-end speaker will not be
substantial.

Now assume a situation where the near-end talker has switched the phone to
hands-free mode. The sound wave coming out of the speaker of the cell phone will
now affect the microphone adversely. In hands-free mode the power level of the
sound emitted from the near-end talker's speaker is several times larger than in
the former case. Having traveled some distance, the sound wave reaches the
microphone and is picked up by it. Therefore, along with the near-end talker's
voice, an additional signal due to the hands-free environment propagates through
the channel and is received by the far-end talker. The received signal thus
contains two components: one is the speech of the near-end talker and the other is
a delayed version of the far-end talker's own signal, arising from the hands-free
environment.
Because of this phenomenon the far-end talker hears his/her own voice as an echo
after a considerable delay. Conditions may become more severe if the near-end
talker is sitting in a room with several reflecting objects or with poor acoustic
immunity. It is known that if the time interval between the echoes from the
near-end side does not exceed one tenth of a second, they can go unnoticed by the
far-end talker. Moreover, since in a mobile communication environment the user can
move anywhere in the room, the echoes will differ from one location to another.
These dynamics prompted engineers to model acoustic echoes as time-varying by
estimating the echo path.
Acoustic Echo Modeling
Echo is a phenomenon in which a time-delayed and distorted copy of an original
sound is reflected back to the source. With rare exceptions, conversations take
place in the presence of echoes. We hear echoes of our speech as they are
reflected off the floor, walls and other nearby objects. If a reflected wave
arrives very shortly after the direct sound, it is perceived as spectral
distortion or reverberation. However, when the leading edge of the reflected wave
arrives a few tens of milliseconds after the original sound, it is perceived as a
distinct echo. Echoes have been an issue in communication networks since the
advent of telephony.
In particular, echoes can be generated electrically by impedance mismatches at
various points along the transmission medium. The most important parameter of an
echo is the end-to-end delay, also known as latency. Latency is the time between
the generation of the sound at one end of the call and its reception at the other
end. The round-trip delay, which is the time taken for an echo to return, is
approximately twice the end-to-end delay. Echoes become annoying when the
round-trip delay exceeds 30 ms. Such an echo is typically heard as a hollow
sound. Echoes must also be loud enough to be heard.
Those weaker than thirty (30) decibels (dB) are unlikely to be noticed. However,
when the round-trip delay exceeds 30 ms and the echo strength exceeds 30 dB,
echoes become steadily more severe. Not all echoes degrade voice quality, though.
For telephone conversations to sound comfortable, callers must be able to hear
themselves speaking. For this reason a short, instantaneous echo, termed side
tone, is deliberately inserted. The side tone couples the caller's speech from the
telephone mouthpiece to the earpiece so that the line sounds connected.[9]
Mathematically, if $x(t)$ is the original signal, then one component of the echo
can be represented as $a\,x(t - t_1)$, where $a$ is the attenuation factor and
$t_1$ is the delay incurred by the sound in reflecting from a surface. When
multiple reflection paths are available, the composite signal at the input of the
microphone can be written as
$$c(t) = x(t) + a_1 x(t - t_1) + a_2 x(t - t_2) + a_3 x(t - t_3) + \dots + a_n x(t - t_n) \tag{1}$$
where $c(t)$ is the composite signal, $x(t)$ is the original signal,
$a_1, a_2, a_3, \dots, a_n$ are the attenuations suffered by the sound along the
corresponding paths and $t_1, t_2, t_3, \dots, t_n$ are the corresponding delays.
Since digital technology now prevails in almost all cases, a sound wave is
represented by sampling and quantizing the electric voltage signal at a rate of at
least the Nyquist rate, that is, twice the highest frequency in the signal. The
composite signal without AEC is then
$$c(k) = x(k) + a_1 x(k - k_1) + a_2 x(k - k_2) + a_3 x(k - k_3) + \dots + a_n x(k - k_n) + n(k) \tag{2}$$
where $k_1, k_2, k_3, \dots, k_n$ are the path delays expressed in samples and the
additional term $n(k)$ is the noise introduced by digitizing the analog voltage
signal.
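The discrete-time model of equation (2) can be evaluated directly. The sketch below is only an illustration: the function name `composite_signal`, the representation of each path as an (attenuation, delay in samples) pair and the example values are assumptions, and samples before the start of the signal are taken as zero.

```python
def composite_signal(x, paths, noise=None):
    """Evaluate c(k) = x(k) + sum_i a_i * x(k - k_i) + n(k).

    x     : list of samples of the original signal
    paths : list of (a_i, k_i) pairs, attenuation and delay in samples
    noise : optional list n(k); treated as zero when omitted
    """
    c = []
    for k in range(len(x)):
        value = x[k]                         # direct sound
        for a, delay in paths:
            if k - delay >= 0:               # nothing arrives before the signal starts
                value += a * x[k - delay]    # attenuated, delayed reflection
        if noise is not None:
            value += noise[k]
        c.append(value)
    return c

x = [1.0, 0.0, 0.0, 0.0, 2.0, 0.0, 0.0, 0.0]
paths = [(0.5, 2), (0.25, 3)]
c = composite_signal(x, paths)
# c == [1.0, 0.0, 0.5, 0.25, 2.0, 0.0, 1.0, 0.5]
```

Each input pulse is followed by two weaker copies, two and three samples later, which is the pattern the model describes.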
Room Impulse Response and its Estimation
When a person speaks in front of a microphone in the open air with no nearby
objects, practically no echo is observed, because sound traveling in the open is
not reflected. In a closed room (fig. 2.1), however, the microphone receives
multiple signals, including the direct one. We may assume that there exists a
system whose input is the original speech signal of the near-end talker (NET) and
whose output is the signal received by the microphone.

If a person produces an acoustic impulse in front of the microphone, the
microphone does not receive that impulse alone; it also receives the signals
arriving along different paths after reflecting off different surfaces. The signal
sensed by the microphone is therefore like the one shown in fig. 2.2, a series of
time-delayed impulses.
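The idea of the room as a system can be sketched by building a sparse impulse response and probing it with a unit impulse. The helper names `impulse_response` and `convolve`, the two reflection paths and the response length are hypothetical choices made for illustration. Each path delay is assumed to be shorter than the response length.

```python
def impulse_response(paths, length):
    """Build h(k) with a unit direct path and one tap per reflection (a_i, k_i)."""
    h = [0.0] * length
    h[0] = 1.0                       # direct sound
    for a, delay in paths:
        h[delay] += a                # reflection arriving `delay` samples later
    return h

def convolve(x, h):
    """Plain linear convolution, truncated to len(x) samples."""
    return [sum(h[i] * x[k - i] for i in range(min(k + 1, len(h))))
            for k in range(len(x))]

h = impulse_response([(0.5, 2), (0.25, 5)], length=8)
impulse = [1.0] + [0.0] * 7          # acoustic impulse produced at k = 0
received = convolve(impulse, h)
# received == [1.0, 0.0, 0.5, 0.0, 0.0, 0.25, 0.0, 0.0]
```

Feeding the impulse through the system returns the impulse response itself, a series of time-delayed impulses. Any other input passed to `convolve` would give the corresponding microphone signal for this assumed room.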
Noise Gate as an NLP
A noise gate, which is a type of dynamics processor, is used as the NLP. Noise
gates belong to the category of expanders. As the name suggests, an expander
increases the dynamic range of a signal, so that low-level signals are attenuated
significantly while higher-level signals are neither attenuated nor amplified. In
a noise gate the expansion is taken to the extreme, where it greatly attenuates
the input or eliminates it completely, leaving only silence. While expanders are
difficult to use effectively, noise gates are a very simple and effective way of
reducing the apparent noise level in audio signals. The noise gate turns down the
gain of an audio signal when the signal level falls below some threshold value.
The threshold needs to be high enough that the background noise falls below it,
but not so high that the audio signal is cut off unnecessarily. Noise gates are
often used to remove noise or hiss that might otherwise be amplified.
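The thresholding behaviour described above can be sketched in a few lines. This is a hard, per-sample gate written only for illustration: the function name `noise_gate` and the threshold of 0.05 are assumptions, and a practical gate would normally act on a smoothed level estimate rather than on individual samples.

```python
def noise_gate(samples, threshold):
    """Zero every sample whose magnitude is below the threshold."""
    return [s if abs(s) >= threshold else 0.0 for s in samples]

residual = [0.02, -0.01, 0.6, -0.7, 0.03, 0.5]   # weak residual echo mixed with speech
gated = noise_gate(residual, threshold=0.05)
# gated == [0.0, 0.0, 0.6, -0.7, 0.0, 0.5]
```

The low-level samples are silenced while the louder ones pass unchanged, which shows why the threshold must sit above the residual echo but below the wanted speech.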