
Model Based Abnormal Acoustic Source Detection Using a Microphone Array
Heungkyu Lee (1), Jounghoon Beh (2), June Kim (3), and Hanseok Ko (2)

(1) Dept. of Visual Information Processing, Korea University, Seoul, Korea
(2) Dept. of Electronics and Computer Engineering, Korea University, Seoul, Korea
(3) Dept. of Information and Communication Engineering, Seokyeong University

Abstract. This paper proposes a model-based method for detecting abnormal
acoustic sources using a microphone array. A general source localization algorithm
based on a microphone array can locate a dominant acoustic source, but in outdoor
environments it cannot verify whether the detected source is a permitted one,
because the source is difficult to distinguish from natural environmental sounds.
To cope with this problem, we propose an out-of-normal acoustic rejection method
based on an N-best likelihood ratio test using natural environmental sound models.
To evaluate the proposed algorithm, a real-time DSP system was constructed, and an
experimental evaluation is described.

1 Introduction
This paper is motivated by the need for an abnormal acoustic source detection
capability that complements the detection performance of an image sensor [1][2].
When the image sensor pans from left to right or vice versa, the boundary region
that falls outside the camera view is not covered for security monitoring, and a
suspicious person can intrude by exploiting this gap. To cope with this issue,
microphone array technology can be used to locate an abnormal acoustic source and
obtain its coordinates. We define an abnormal acoustic source as a speech signal or
any other man-made acoustic signal. With a microphone array, the first step is to
compute the Time Difference Of Arrival (TDOA) between the array signals. Because
the estimate of the direction of arrival (DOA) angle is especially poor in noisy
outdoor environments, we employ an end-point detection algorithm to detect acoustic
sources that exceed a pre-defined and adapted threshold value. At this point, we
still cannot know whether the detected acoustic source is a valid one. Natural
sounds in outdoor environments include wind, rain, birds singing, thunder, and
breaking waves. Man-made acoustic sources, in contrast, are varied: speech, a
person's footsteps, the breaking of a steel-barred window, and so on. To resolve
this issue, we model the sounds of natural environments using Hidden Markov Models
(HMMs) and then verify the detected acoustic source against them. Using the
environmental sound models as anti-models, we propose an out-of-normal acoustic
rejection method based on an N-best likelihood ratio test (LRT). Figure 1 describes
the overall process flow.
S. Zhang and R. Jarvis (Eds.): AI 2005, LNAI 3809, pp. 966-969, 2005.
© Springer-Verlag Berlin Heidelberg 2005

Fig. 1. System block-diagram for detecting abnormal acoustic source

2 Detection and Verification of Abnormal Acoustic Source


For abnormal acoustic source detection as shown in Figure 1, a six-channel microphone
array is used to locate a dominant acoustic source in a given environment. The
distance between microphones is 10 cm, and we use microphone pairs. To reduce the
false acceptance of normal acoustic sources and the estimation error of the DOA, we
apply the DOA computation only to sources detected by an end-point detection
algorithm [3] of the kind most widely used in speech recognition technology.
Among the various end-point detection approaches, we apply an energy-based method,
the most widely applied solution to this problem, which uses the following
parameters: signal energy, zero-crossing rate, duration, and linear prediction error
energy. This method is applied after a time difference of arrival computation has
been performed using delay-and-sum beamforming. Next, to estimate the DOA, the time
difference of arrival for source location is computed using the cross-power spectrum
phase (CPSP) method [2]. We use only the phase information in the cross-power
spectrum of the two signals, because whitening the input signals is an effective
approach when no a priori knowledge about the statistics of the involved signals is
available.
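The CPSP step can be illustrated with a minimal sketch. It is not the authors' DSP implementation: the naive DFT, the function names, the speed of sound of 343 m/s, the far-field geometry, and the circularly shifted test signal are all assumptions made for the example. Only the 10 cm spacing and the 11 kHz sampling rate come from the paper.

```python
import cmath
import math
import random


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (kept dependency-free)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
        out.append(acc / n if inverse else acc)
    return out


def cpsp_delay(x1, x2, max_lag):
    """Return the lag (in samples) by which x2 trails x1, using phase only."""
    n = len(x1)
    X1, X2 = dft(x1), dft(x2)
    cross = [b * a.conjugate() for a, b in zip(X1, X2)]
    # Whitening: discard the magnitude, keep the phase of the cross-power spectrum.
    phase = [c / abs(c) if abs(c) > 1e-12 else 0j for c in cross]
    corr = dft(phase, inverse=True)
    # The peak of the whitened cross-correlation marks the time difference of arrival.
    return max(range(-max_lag, max_lag + 1), key=lambda d: corr[d % n].real)


def doa_degrees(lag, fs, spacing_m, speed_of_sound=343.0):
    """Far-field angle from broadside for one microphone pair (assumed geometry)."""
    ratio = speed_of_sound * lag / (fs * spacing_m)
    return math.degrees(math.asin(max(-1.0, min(1.0, ratio))))


random.seed(0)
fs, spacing = 11000, 0.10                      # 11 kHz sampling, 10 cm spacing
x1 = [random.gauss(0.0, 1.0) for _ in range(64)]
x2 = x1[-3:] + x1[:-3]                         # x1 circularly delayed by 3 samples
max_lag = int(fs * spacing / 343.0)            # largest physically possible lag
lag = cpsp_delay(x1, x2, max_lag)
angle = doa_degrees(lag, fs, spacing)
print(lag, round(angle, 1))
```

With this spacing and sampling rate the physically possible lags span only a few samples, so a practical system would likely need interpolation around the correlation peak to obtain a finer angular resolution.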
To verify whether the detected acoustic source is a valid one, we apply the HMM
decoding module and then perform the LRT. Because man-made acoustic sources cover a
great variety of sounds, we use an environmental sound database to build reverse
models; that is, we train the acoustic models on the environmental sound database.
We then employ the N-best out-of-normal acoustic (OONA) rejection method for the
final decision. If the detected acoustic source is identified as one of the acoustic
models built from the environmental sound database, its acoustic type is judged
valid. Otherwise, the detected acoustic source is considered abnormal. The natural
sound models of wind, rain, birds singing, and waves breaking on the beach are built
as continuous Hidden Markov Models, using a 24-state, 16-mixture left-to-right HMM
as the discriminant function.
First, the verification procedure for an abnormal acoustic source computes

$$W_k = \arg\max_{k} L(O \mid \lambda_{1,\ldots,k}) \qquad (1)$$

where $O$ is the observation sequence, $W_k$ is the most likely acoustic type, and
$\lambda$ denotes the environmental sound models. Then the N-best OONA rejection
method, derived from the subword-based likelihood ratio test (LRT) [4], is applied as
follows:

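$$\mathrm{LRT}(X) = \frac{P(X \mid H_0)}{P(X \mid H_1)} = \frac{P(O_n \mid \lambda_n)}{P(O_n \mid \bar{\lambda}_n)} \geq \tau \qquad (2)$$

where $H_0$ and $H_1$ denote the hypotheses that the candidate is true and false,
modeled by $\lambda_n$ and $\bar{\lambda}_n$ respectively, and $\tau$ is a given
threshold value. For the N-best models, this equation becomes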
$$\mathrm{LRT}(O_n) = \frac{1}{l_n}\left[\log P(O_n \mid \lambda_0) - \frac{1}{nBest}\sum_{m=1}^{nBest} \log P(O_n \mid \lambda_m)\right] \qquad (3)$$

where the model $\lambda_0$ is the environmental sound model with the maximum
likelihood score and the $\lambda_m$ are the environmental sound models with the
N-best highest likelihood scores. The variable $nBest$ is the number of most likely
sequences. Finally, the likelihood ratio is compared with a given threshold value for
the verification task. If its value is below the threshold, the candidate is
considered abnormal, because this shows that the overall likelihood scores are
similar and that no corresponding model exists among the given acoustic models. We
therefore decide that the detected acoustic source is abnormal.
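The decision rule of Eqs. (1) and (3) can be sketched as follows, assuming the per-model log-likelihoods have already been produced by an HMM decoder. The function names, the example scores, and the threshold of 0.5 are hypothetical. Two further assumptions are made because the text leaves them open: the normalisation term in front of Eq. (3) is treated as the number of frames, and the N-best list is taken to include the best-scoring model.

```python
def nbest_lrt(log_likelihoods, n_best, num_frames):
    """Return (best model label, normalised N-best log-likelihood ratio)."""
    ranked = sorted(log_likelihoods.items(), key=lambda item: item[1], reverse=True)
    best_label, best_score = ranked[0]           # Eq. (1): most likely model
    top_scores = [score for _, score in ranked[:n_best]]
    mean_top = sum(top_scores) / len(top_scores)
    # Eq. (3): best score minus the N-best average, normalised by duration.
    return best_label, (best_score - mean_top) / num_frames


def verify(log_likelihoods, n_best, num_frames, threshold):
    """Accept the best environmental class, or reject the source as abnormal."""
    label, ratio = nbest_lrt(log_likelihoods, n_best, num_frames)
    return label if ratio >= threshold else "abnormal"


# One model clearly wins: the source is accepted as that environmental sound.
distinct = {"wind": -100.0, "rain": -400.0, "bird": -420.0, "beach": -450.0}
# All models score alike: no model fits, so the source is rejected.
flat = {"wind": -400.0, "rain": -405.0, "bird": -410.0, "beach": -420.0}

print(verify(distinct, n_best=3, num_frames=100, threshold=0.5))
print(verify(flat, n_best=3, num_frames=100, threshold=0.5))
```

In practice the threshold would have to be tuned on held-out data, since the scale of the ratio depends on the features and on the model topology.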

3 Experimental Results
For acoustic source input, the signal is sampled as 11 kHz PCM. Acoustic signals are
analyzed in 125 ms frames overlapped by 10 ms and converted into 39th-order feature
vectors consisting of 13th-order MFCCs, including log energy, and their 1st and 2nd
derivatives. The training data set was collected from natural scenes and from
previously recorded waveforms. The total number of classes is five, and the total
recording time is about 3 hours. The classes are the sounds of wind, rain, birds
singing, rain and thunder, and breaking waves. For testing, the Aurora2 speech DB is
used.
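To make the 39-dimensional layout concrete, the sketch below stacks 13 static coefficients with their first and second derivatives. The regression formula, the window of two frames on each side, and the edge handling by frame repetition are common choices assumed here; the paper does not state how its derivatives are computed, and the ramp input is synthetic.

```python
def deltas(frames, window=2):
    """Regression-based time derivative of each coefficient (assumed formula)."""
    n = len(frames)
    denom = 2 * sum(k * k for k in range(1, window + 1))
    out = []
    for t in range(n):
        row = []
        for i in range(len(frames[t])):
            # Frames beyond either end are replaced by the nearest valid frame.
            num = sum(
                k * (frames[min(t + k, n - 1)][i] - frames[max(t - k, 0)][i])
                for k in range(1, window + 1)
            )
            row.append(num / denom)
        out.append(row)
    return out


def stack_features(static):
    """Concatenate static, 1st-derivative and 2nd-derivative coefficients."""
    d1 = deltas(static)
    d2 = deltas(d1)
    return [s + a + b for s, a, b in zip(static, d1, d2)]


# Synthetic input: 10 frames of 13 coefficients that rise by 1 per frame.
static = [[float(t)] * 13 for t in range(10)]
features = stack_features(static)
print(len(features), len(features[0]))
```

For this ramp the first derivative is 1 and the second derivative is 0 away from the edges, which gives a quick sanity check of the stacking order.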
The sound models are constructed as left-to-right Continuous Hidden Markov Models
(CHMMs) with 50 states and 16 mixtures. First, we evaluate recognition performance
by applying environmental sound waveforms to the proposed system in order to verify
the training accuracy. This also verifies that the constructed acoustic models remain
robust when a false alarm (an environmental sound) is detected, since false alarms
should be discarded. The results for different numbers of mixtures are shown in
Table 1.
Table 1. Recognition performance to verify the training accuracy

No. of mixtures           1      2      4      8     16     32
Recognition rate (%)  86.02  94.37  95.26  96.20  96.84  96.08

To evaluate the OONA rejection rate, speech data from the Aurora 2 DB are applied.
In the proposed system, a speech signal is considered an abnormal acoustic source.
As shown in Table 2, all of the speech data are classified as speech; that is, all
of the speech data are judged to be abnormal acoustic sounds. Some environmental
sounds are recognized as a different environmental type, but this does not affect
the final result because all of the environmental sounds are still classified into
the valid class. These results show that abnormal acoustic verification performs
well when a speech signal is detected. This is because the MFCC feature vectors of
speech signals are very different from those of environmental sounds. In addition,
the man-made acoustic signals also proved to be very different from the
environmental sounds, although the test data are not sufficient for a full
evaluation.
Table 2. Confusion matrix (TND: Total Number of Data, ACC: Accuracy, %)

          Speech  Beach   Bird   Rain  Thunder   Wind
Speech      1064      0      0      0        0      0
Beach          0    268      5      2        1      0
Bird           0      1     18      0        0      0
Rain           0      0      0    178       26      0
Thunder        0      0      0      0        3      0
Wind           0      0      0      0        0     15
TND         1064    269     23    180       30     15
ACC          100   99.6   78.3   98.9     10.0    100
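The TND and ACC rows of Table 2 can be checked with a few lines of arithmetic. The sketch assumes that columns hold the input class and rows the recognized class, an orientation inferred from the fact that the reported TND values equal the column sums. It also counts how often an environmental sound is recognized as speech, or speech as an environmental sound, which is the only confusion that would change the valid/abnormal decision.

```python
classes = ["Speech", "Beach", "Bird", "Rain", "Thunder", "Wind"]

# Rows: recognized class, columns: input class (assumed orientation).
confusion = [
    [1064, 0, 0, 0, 0, 0],
    [0, 268, 5, 2, 1, 0],
    [0, 1, 18, 0, 0, 0],
    [0, 0, 0, 178, 26, 0],
    [0, 0, 0, 0, 3, 0],
    [0, 0, 0, 0, 0, 15],
]

size = len(classes)
# Total number of data per input class = column sum.
tnd = [sum(row[j] for row in confusion) for j in range(size)]
# Per-class accuracy = diagonal entry over column sum, in percent.
acc = [round(100.0 * confusion[j][j] / tnd[j], 1) for j in range(size)]

# Confusions that cross the valid/abnormal boundary.
env_as_speech = sum(confusion[0][1:])
speech_as_env = sum(row[0] for row in confusion[1:])

for name, total, rate in zip(classes, tnd, acc):
    print(f"{name:8s} TND={total:5d} ACC={rate:5.1f}")
print(env_as_speech, speech_as_env)
```

Under this reading, the low thunder accuracy comes mostly from thunder inputs recognized as rain, and both boundary-crossing counts are zero, so the valid/abnormal decision is unaffected.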

4 Conclusions
In order to verify whether a detected source is valid, we proposed an out-of-normal
acoustic rejection method based on the N-best likelihood ratio test using natural
environmental sound models. The results show that the verification of abnormal
acoustic sources performs well when a speech signal is detected.

Acknowledgements
This work was supported by grant No. 10012805 from the Korea Institute of Industrial Technology Evaluation & Planning Foundation.

References
[1] J.A. Cadzow, "Multiple source location-the signal subspace approach," IEEE Trans. on
Signal Processing, Vol. 38, No. 7, pp. 1110-1125, July 1990.
[2] C.H. Knapp and G.C. Carter, "The generalized correlation method for estimation of
time delay," IEEE Trans. on Acoustics, Speech, and Signal Processing, Vol. ASSP-24,
No. 4, August 1976.
[3] C.E. Mokbel and G.F.A. Chollet, "Automatic word recognition in cars," IEEE Trans. on
Speech and Audio Processing, Vol. 3, pp. 346-356, Sept. 1995.
[4] E. Lleida and R.C. Rose, "Utterance verification in continuous speech recognition,"
IEEE Trans. on Speech and Audio Processing, Vol. 8, March 2000.
