
IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 50, NO. 12, DECEMBER 2002


Spectrogram Segmentation by Means of Statistical Features for Non-Stationary Signal Interpretation


Cyril Hory, Nadine Martin, and Alain Chehikian
Abstract: Time-frequency representations (TFRs) are suitable tools for nonstationary signal analysis, but reading them is not straightforward in a signal interpretation task. This paper investigates the use of TFR statistical properties for classification or recognition purposes, focusing on a particular TFR: the spectrogram. From the properties of the periodogram of a stationary process, we derive the properties of the spectrogram of a nonstationary process. This leads us to transform the TFR into a space of local statistical features, from which we propose a segmentation method. We illustrate the approach with first- and second-order statistics and identify the information that each of them provides. The segmentation is performed by a region growing algorithm, which does not require any prior knowledge of the nonstationary signal. The result is an automatic extraction of informative subsets from the TFR, which is relevant for understanding the signal. Examples are presented for synthetic and real signals.

Index Terms: χ² distribution law, maximum likelihood, region growing technique, statistical pattern recognition, time-frequency analysis.

Manuscript received August 2, 2001; revised July 8, 2002. The associate editor coordinating the review of this paper and approving it for publication was Dr. Chong-Yung Chi. C. Hory and N. Martin are with the Laboratoire des Images et des Signaux (LIS), UMR 5083, CNRS-INP Grenoble, Grenoble, France (e-mail: cyril.hory@lis.inpg.fr). A. Chehikian is with the Université Joseph Fourier, Grenoble, France. Digital Object Identifier 10.1109/TSP.2002.805489

I. INTRODUCTION

This paper investigates a new method for the interpretation of nonstationary processes. The issue is to define an automatic process that supports a decision from the analyzed signal. This is the case, for instance, in fault detection for industrial control, but also in many other application domains. The relevant information to be extracted from a nonstationary signal is contained in the time evolution of its spectral content. Techniques based on time or frequency representations of the signal alone are not appropriate for providing such information. Several analysis methods called time-frequency representations (TFRs) have been proposed to represent a signal in a hybrid space [7], [15]. A TFR displays the energy content of a signal along both the time and frequency dimensions. The components of the analyzed signal are described in this space by structures called spectral patterns.

In the literature, many approaches have been proposed to design automatic interpretation techniques involving TFRs. Two main classes can be distinguished according to the position of the time-frequency (TF) tool in the interpretation method. In the first class, TFRs are fitted toward the objectives of the method. This is the case, for example, of methods based on reassigned TFRs [4], which provide increased readability, or adaptive kernels of Cohen distributions [2], [5]. The method we propose in this paper lies within the second class, where the TF interpretation is considered as a postprocessing. In that case, the interpretation task does not have any influence on the performance of the TF analysis, such as resolution or variance. It is relevant if designed from the inner properties of the TFR. We already proposed in [17] a processing of a TFR based on mathematical morphology tools. This method was efficient for the segmentation of straight spectral patterns but could not succeed in detecting slowly varying structures because of the use of the gradient function, as shown in [14].

We present here a new method that is adapted to narrowband as well as wideband components. We propose to exploit the TFR statistical properties to extract and characterize spectral patterns with no a priori knowledge of the analyzed signal. The methods mentioned above consider the TF plane in a global manner. We consider here the nonstationarity described by the spectral patterns in a local approach. The basic principle is to extract local statistical features from the TFR points. Features are selected such that spectral patterns aggregate in the so-called features space. This way, spectral patterns are not characterized by an iso-energy level but by a local TF coherency. We will see that, in the particular case of the TFR, the variance of a pattern increases with its energy level. Therefore, our approach concerns TF structures that could not be extracted by a constant thresholding. Furthermore, the proposed segmentation is blind with respect to the analyzed signal. This is of importance in industrial applications.

The TFR chosen is the spectrogram because its statistical properties have been derived for stationary processes [11], [13]. We extend this study to nonstationary processes and propose a general method of interpretation based on the derived TFR statistical model.
Then, we propose a first set of features and study their statistical properties. This provides a description of the features space and allows one to build an appropriate region growing method of segmentation combined with data analysis criteria. Examples on synthetic and natural signals are presented to illustrate the region growing algorithm and validate the efficiency of the method.

II. METHOD

Each location of a TFR is characterized by an energy level called a TF coefficient. Connected points form regions that define the spectral patterns. Segmenting a TFR consists of deciding whether a coefficient belongs to some deterministic component region or to a noise (or background) region. To make such a decision, one needs more than the energy level, for two reasons. First, the Heisenberg-Gabor inequality implies that the energy content of the signal at a given instant and frequency is spread over a neighborhood of this point in the TFR. Thus, a single TFR coefficient cannot fully describe the signal content at that instant and frequency. Second, spectrogram coefficients are randomly corrupted by the power of the embedding noise. We thus propose to associate features other than the energy level with each location.

In order to take the uncertainty principle into consideration, we define features involving sets of spectrogram coefficients. These features are chosen as statistics of the spectrogram coefficients. We associate with each point a cell of neighboring coefficients. Its size must be small with respect to the total number of points in order to ensure a local description of the TFR. A set of features is then computed over each cell.

Segmenting the TFR requires a label to be associated with each spectrogram location, namely, noise or signal plus noise. As this segmenting procedure is made difficult by both the Heisenberg-Gabor inequality and the noise corrupting the energy level, we propose instead to transform the spectrogram energy levels into another space. In this space, which is referred to as the features space, each TF location is positioned with respect to the statistical features extracted from the neighboring cell, as displayed in Fig. 1. Clusters are formed in the features space by cells of similar statistical properties.

Fig. 1. Overview of the method.

We perform a region growing algorithm that operates a segmentation by associating a common label with connected points in the TFR that have the same properties in the features space. The region growing technique is free of tuning parameters. It is known to be stable with respect to noise [1], [3]. Thus, it can be applied to a wide range of signals without specifications. It provides a characterization of the spectral patterns with no a priori knowledge about their location and orientation in the TF space, based only on their magnitude variations.
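The transformation from the TFR to the features space described in this section can be sketched as follows. This is only an illustration under stated assumptions: the two statistics are taken to be the mean and the standard deviation of each cell (the choice the paper makes in Sections III and IV), and the function name `local_features`, the half-width parameters, and the clipping of cells at the borders are hypothetical choices that do not come from the paper.

```python
import statistics

def local_features(tfr, half_t=1, half_f=1):
    """Map every TFR point to (local mean, local standard deviation) of its cell.

    tfr is a list of rows (frequency bins), each a list of energies over time.
    The cell of a point is its (2*half_f+1) x (2*half_t+1) neighbourhood,
    clipped at the borders of the TFR (an illustrative choice).
    """
    n_f, n_t = len(tfr), len(tfr[0])
    features = []
    for k in range(n_f):
        row = []
        for n in range(n_t):
            cell = [tfr[kk][nn]
                    for kk in range(max(0, k - half_f), min(n_f, k + half_f + 1))
                    for nn in range(max(0, n - half_t), min(n_t, n + half_t + 1))]
            row.append((statistics.fmean(cell), statistics.pstdev(cell)))
        features.append(row)
    return features

# Toy TFR: flat background of energy 1.0 with a brighter 3 x 3 patch of energy 5.0.
tfr = [[1.0] * 8 for _ in range(8)]
for k in range(3, 6):
    for n in range(3, 6):
        tfr[k][n] = 5.0

feat = local_features(tfr)
```

In this toy case, points whose cell lies entirely in the background or entirely in the patch have zero local standard deviation, while points whose cell straddles the edge of the patch have a nonzero one, which is the kind of separation the features space is meant to expose.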
No adjustment in time is necessary, contrary to most existing processing methods, which require an estimation of the initial time of the pattern. We choose the features such that they provide descriptive parameters of the TF structures. These features, as combinations of random variables, are themselves random variables. In the features space, they aggregate as clusters whose location and dispersion are measured by their expected value and variance, respectively. We propose a theoretical study of these statistical properties, which makes it possible to predict the position of the spectral patterns in the features space.

III. TOWARDS A STATISTICAL INTERPRETATION

We derive in this section the probability density function (PDF) of the spectrogram coefficients of a deterministic

sequence embedded in white Gaussian noise. We then propose a local statistical model of the spectrogram.

A. Spectrogram Statistical Properties

Let us consider the signal defined as the sum of a deterministic discrete sequence of samples and of a white Gaussian process of zero mean and variance σ² (1).

The discrete spectrogram of this signal at a given time and frequency is the periodogram of the signal weighted by a window (2). When the window is a boxcar window, which is equal to one over its support and zero elsewhere, the spectrogram coefficients of the white Gaussian noise are known to be central χ² with two degrees of freedom [11], [13], with the proportionality parameter given in (3); this result holds asymptotically [13]. It is well known [10] that the χ² PDF is a gamma distribution (4), parameterized by the proportionality parameter and the number of degrees of freedom.

The sequence of (1) is a set of independent Gaussian variables of nonzero means and variance σ². We extend the proof proposed in [13] by considering nonzero means and conclude that each spectrogram coefficient is a noncentral χ² with two degrees of freedom, with the noncentral parameters and the proportionality parameter given in (5).

See Appendix A for the expression of the noncentral χ² distribution. The central χ² of (3) is a special case of the noncentral χ² with a null noncentral parameter.
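The noncentral χ² behavior of a spectrogram coefficient can be checked numerically. The sketch below is not the paper's derivation: it assumes a boxcar window of L samples, a periodogram normalized by 1/L, and a proportionality parameter of σ²/2, which is the standard result for a bin that is neither the zero nor the Nyquist frequency. Under these assumptions, a coefficient whose noise-free value is `d_coeff` has mean σ² + d_coeff and variance σ⁴ + 2σ²·d_coeff. The names `periodogram_bin` and `simulate` and all numerical values are illustrative.

```python
import cmath
import math
import random

def periodogram_bin(x, k):
    """|DFT_k(x)|^2 / len(x): one boxcar-window periodogram coefficient."""
    L = len(x)
    acc = sum(x[m] * cmath.exp(-2j * math.pi * k * m / L) for m in range(L))
    return abs(acc) ** 2 / L

def simulate(sigma2, amplitude, k=5, L=32, trials=4000, seed=0):
    """Return (noise-free coefficient, sample mean, sample variance) at bin k
    for a cosine of the given amplitude embedded in white Gaussian noise."""
    rng = random.Random(seed)
    sigma = math.sqrt(sigma2)
    d = [amplitude * math.cos(2 * math.pi * k * m / L) for m in range(L)]
    d_coeff = periodogram_bin(d, k)
    samples = [periodogram_bin([dm + rng.gauss(0.0, sigma) for dm in d], k)
               for _ in range(trials)]
    mean = sum(samples) / trials
    var = sum((s - mean) ** 2 for s in samples) / (trials - 1)
    return d_coeff, mean, var

sigma2 = 4.0
d_coeff, mean, var = simulate(sigma2, amplitude=1.0)

# Moments of (sigma2 / 2) * noncentral chi-square with 2 degrees of freedom
# and noncentral parameter 2 * d_coeff / sigma2 (assumed parameterization).
expected_mean = sigma2 + d_coeff
expected_var = sigma2 ** 2 + 2.0 * sigma2 * d_coeff
```

Both moments grow linearly with the noise-free coefficient, and setting `amplitude=0.0` recovers the central case, in which the coefficient has mean σ² and variance σ⁴.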


Fig. 2. Cell description. White squares are central χ² spectrogram coefficients and black squares are noncentral χ² spectrogram coefficients.

Note that a small number of spectrogram coefficients can have different distributions due to the use of a time window [11]. We do not take them into account in the distribution modeling.

The behavior of random variables is fully characterized by their moments. We propose in Appendix A a general expression of the moments about zero of a noncentral χ² distribution. In particular, the expected value (6) and the variance (7) of a spectrogram coefficient are derived from (31). They increase linearly with the noncentral parameter. This parameter describes the content of the deterministic signal alone at instant n and frequency k. If this signal is a nonstationary process, then the nonstationarity to be analyzed is carried into the above moments by the noncentral parameter. The set of noncentral parameters is thus a signature of the nonstationarity.

B. Local Statistical Model

Each cell is a set of N TFR coefficients having χ² PDFs, each with its own noncentral parameter, and with variance σ². Thus, the PDF of the parent variable associated with the cell is defined by many unknown parameters (one noncentral parameter per coefficient, plus the variance). Since we want to define features as statistics of this parent variable, a model of the cell must be defined to reduce the number of unknown parameters.

Consider a cell in which only some of the N points contain energy of the deterministic signal (indexes (n, k) are omitted from now on when dealing with a single cell, where no confusion is possible). Each of the associated coefficients is a noncentral χ² with its own noncentral parameter. As the size of the cell is small with regard to the size of the TFR (which guarantees the local approach), the energy contribution of the deterministic signal can be considered to vary slowly over these coefficients. Their noncentral parameters can then be approximated by a single parameter, which is their mean over the cell (8). The other coefficients are central χ². Therefore, each coefficient of the cell is a sample of a noncentral χ² random variable with probability p and a sample of a central χ² random variable with probability 1 − p (see Fig. 2). By the law of total probability, each coefficient of the cell can be considered to be a sample of the parent variable, whose PDF is a mixture of χ² PDFs (9). Equation (9) is the statistical model we propose to apply to the cell. Under the assumption of (8), the PDF of the parent variable depends on only three unknown parameters: the proportion p, the common noncentral parameter, and the variance σ². By linearity of the Fourier transform, the first characteristic function of the parent variable can be written as (10). The moments of the parent variable are derived from (10), as given in (11). Expressions of the moments of the noncentral and central χ² distributions are derived from (31) of Appendix A,

with the parameters specific to each distribution. We thus have (12), where r is the local signal-to-noise ratio (SNR) over the cell. Note that this interpretation is common in signal theory, where the noncentral parameter is assimilated to an SNR.

Finally, each cell defined over the TF space is described by its parent variable. The spectrogram coefficients of the cell are samples of this random variable. Its moments, which are expressed by (12), depend on two parameters:
p: ratio of spectrogram coefficients that bear the deterministic component energy;
r: SNR over the cell.
The simultaneous variations of these parameters over the whole TF space are related to the variations of the deterministic component energy along the time and frequency dimensions. They characterize the magnitude variations of the signal components.

We want to extract features whose statistical behavior, which is related to the variations of p and r, provides a clear discrimination of points belonging to spectral patterns of different magnitude variations. In order to limit the noise effects on the TFR readability, we select as a first feature the expected value of the parent variable (13). This feature is relevant for characterizing TF regions of low-energy density variations. We propose to combine this processing with the extraction of a second feature: the standard deviation of the cell (14). This feature characterizes high-energy density variations over the TF space.

IV. STATISTICAL PROPERTIES OF THE FEATURES

In the previous section, features are defined as the expected value and the standard deviation of the parent variable associated with each local cell in the TFR. In this section, we propose estimators of these features. We also give expressions of their first two moments. These statistical properties are necessary for describing the clusters that would be obtained in the Features Space. Moreover, we discuss the influence of the time window on the distribution of the TFR coefficients, given that the theory derived in the previous section assumes independence of these coefficients.

A. Local Mean

Assuming ergodicity, the feature of (13) is estimated by the empirical mean of the cell (15). As we will show in Appendix B, it is an unbiased estimator of the feature of (13). When the cell contains energy of the deterministic component, the distribution of this estimator is the N-times convolution of distribution (9) with a nonnull noncentral parameter, if the spectrogram coefficients are independent. We do not provide the analytical expression of this distribution, but in Appendix B, we derive expressions of the first and second moments of the estimator, which are polynomials of order one and two, respectively, in p and r. Considering the case of cells containing only noise spectrogram coefficients (p = 0 and r = 0), (35) and (36) of Appendix A take the form (16) and (17), as we have already shown in [8]. In this noise-only case, the local mean is a χ² random variable, as proven with a matrix formulation in [11].

B. Coefficients Correlation

In many situations, the TF space presents redundancy of information, which means correlation between TF coefficients. Let us consider the spectrogram of a white Gaussian process. Its coefficients along the time axis are correlated if the time windows overlap [19], [20]. They are asymptotically uncorrelated along the frequency axis [13], but the use of a weighting window also introduces correlation along the frequency axis [11]. We discuss the deviation of the local mean feature from its theoretical PDF on the basis of a simulation study. Four white Gaussian process spectrograms are generated, with 50% window overlap or without overlap, and with a boxcar or a Hanning window. The correlation of the spectrogram coefficients decreases as the time window length increases. The correlation due to the window length is thus assumed to be negligible when using 1024-point windows. Fig. 3 presents, for each spectrogram, the theoretical χ² PDF compared with the histogram of the local mean. Fig. 3(a) concerns the spectrogram whose coefficients are uncorrelated. In Fig. 3(b), the correlation is due to the use of a Hanning window and, in Fig. 3(c), to the overlap. In Fig. 3(d), we combine both sources of correlation. The introduction of correlation increases the dispersion of the data histogram. Johnson et al. [11] show that the consequence of the use of a Hanning window is a convolution of several χ² PDFs with various proportionality coefficients. This produces a smoothing of the PDF. Overlapping time windows have the same effect on the PDF shape. The histogram smoothing in Fig. 3(d), due to both the overlap and the use of the Hanning window, is not more pronounced than in the other cases because the Hanning window reduces the correlation along the time axis [19], [20].

We will see in Section V that our segmentation procedure is controlled by the PDF of the local mean feature extracted from cells containing only noise energy. The theoretical PDF derived in Section III is not valid in the presence of correlation, which is mostly the case. Therefore, the effect of correlation on the true PDF must be considered. Let us suppose that the features are χ² distributed with unspecified parameters. We show in [9] that the


Fig. 3. Comparison between the histograms of the local mean F (+) and its theoretical χ² PDF (dashed lines) for a white Gaussian noise N(0, 10), computed with cells of 3 × 5 points. Each spectrogram is computed with a 1024-point time window and without zero padding. The χ² PDF is estimated by maximum likelihood (plain lines).

maximum likelihood estimators of the parameters of a χ² distribution are accurately approximated by (18) and (19), which involve statistics of the data and the number of spectrogram coefficients. These statistics are sufficient for the number of degrees of freedom and the noncentral parameter. The variance of the white Gaussian process is then efficiently estimated with a low computation cost.

Fig. 3 also shows the central χ² distributions estimated by (18) and (19). Table I gives the mean of the two estimates over 100 realizations for each spectrogram configuration and shows that correlation decreases the number of degrees of freedom. One can conclude that, whatever the configuration of the spectrogram computation, the noise power can accurately be estimated by considering the local mean taken over noise cells as a χ² variable.

In the presence of deterministic components, the histogram of the local mean is a mixture of central and noncentral χ² PDFs. A binary classification scheme based on the expectation-maximization algorithm, for instance, might be used to identify the data of the mixture. This would lead to a TFR segmentation without characterizing the variations of the spectral pattern parameters. We propose instead to add a second characterizing feature to the processing.
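The idea of fitting a χ² law with a free number of degrees of freedom and a free proportionality parameter to the noise-only local means can be illustrated as follows. The approximate maximum likelihood estimators (18) and (19) are not reproduced here; the sketch swaps in a plain method-of-moments fit, which is a different and simpler technique. It also assumes independent coefficients, each equal to σ²/2 times a central χ² with two degrees of freedom, that is, an exponential variate of mean σ². The function names and the numerical values are illustrative.

```python
import random

def chi2_moment_fit(values):
    """Method-of-moments fit of Y = a * chi-square(nu); returns (nu, a).

    Uses E[Y] = a * nu and Var[Y] = 2 * a**2 * nu.
    """
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    nu = 2.0 * mean ** 2 / var
    a = var / (2.0 * mean)
    return nu, a

def noise_local_means(sigma2, cell_points, n_cells, seed=0):
    """Local means of independent noise-only cells of cell_points coefficients."""
    rng = random.Random(seed)
    return [sum(rng.expovariate(1.0 / sigma2) for _ in range(cell_points)) / cell_points
            for _ in range(n_cells)]

# Noise variance 10 and cells of 3 x 5 = 15 points, as in the Table I setting.
means = noise_local_means(sigma2=10.0, cell_points=15, n_cells=4000)
nu_hat, a_hat = chi2_moment_fit(means)
noise_power_hat = nu_hat * a_hat  # product a * nu, i.e. the mean of the fitted law
```

With independent coefficients, the fitted number of degrees of freedom should come out near the theoretical value 2N = 30 quoted for Table I, and the product of the two fitted parameters gives the noise power. Correlated coefficients, which this sketch does not simulate, would be expected to lower the fitted number of degrees of freedom, as the text reports.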

TABLE I. ESTIMATION OF THE NUMBER OF DEGREES OF FREEDOM OF A χ² DISTRIBUTION OF A SPECTROGRAM LOCAL MEAN F IN THE CASE OF A WHITE GAUSSIAN NOISE N(0, 10) AND ESTIMATION OF ITS POWER. THE THEORETICAL NUMBER OF DEGREES OF FREEDOM IS 2N = 30 WITH N = 3 × 5

C. Local Standard Deviation

Assuming ergodicity, the feature of (14) is estimated by the standard deviation of the cell (20). The first and second moments of this estimator are given by (40) and (41) of Appendix B. In the case of a noise cell, expressions (40) and (41) take the form (21) and (22) [8].
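The behavior of the two features on noise cells and on cells that contain deterministic energy can be explored with a small simulation. It is a sketch under assumptions that are not taken from the paper: coefficients are independent, a coefficient is the squared modulus of a complex Gaussian sample of total variance σ², the local SNR r is defined as the deterministic energy divided by σ², and the standard deviation uses a 1/N normalization. All names are illustrative.

```python
import math
import random
import statistics

def draw_coefficient(rng, sigma2, r):
    """Squared modulus of a complex Gaussian sample of total variance sigma2
    and mean energy r * sigma2 (r = 0 gives a noise-only coefficient)."""
    s = math.sqrt(sigma2 / 2.0)
    re = rng.gauss(math.sqrt(r * sigma2), s)
    im = rng.gauss(0.0, s)
    return re * re + im * im

def average_features(sigma2, n_points, n_signal, r, n_cells=3000, seed=0):
    """Average (local mean, local standard deviation) over simulated cells in
    which n_signal of the n_points coefficients carry deterministic energy."""
    rng = random.Random(seed)
    means, stds = [], []
    for _ in range(n_cells):
        cell = [draw_coefficient(rng, sigma2, r if i < n_signal else 0.0)
                for i in range(n_points)]
        means.append(statistics.fmean(cell))
        stds.append(statistics.pstdev(cell))  # 1/N normalization (assumption)
    return statistics.fmean(means), statistics.fmean(stds)

# Cells of 7 x 3 = 21 points and noise variance 10, as in the Fig. 5 setting.
noise_mean, noise_std = average_features(10.0, 21, 0, 0.0)
signal_mean, signal_std = average_features(10.0, 21, 7, 3.0)  # p = 1/3, r = 3
```

Under these assumptions the expected local mean is σ²(1 + p·r), so the noise cells average about σ² and the mixed cells about twice that value, while the local standard deviation of the mixed cells is markedly larger than that of the noise cells. The noise values can be compared with the noise point quoted in the caption of Fig. 5.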


Fig. 4. Noise cluster spread. (a) Spectrogram of a white Gaussian process of zero mean and variance σ² = 10 and (b) the associated features space spanned by the two features.

Equations (16) and (21) show that the noise points form a cluster in the Features Space located around the point given by the expected values of the two features. Fig. 4 illustrates this result in the case of a white Gaussian process of zero mean and variance σ² = 10. As expected from (16) and (21), the cluster is located around that point.

Let us now consider the case of a deterministic component embedded in white Gaussian noise. Parameters p and r increase when the cell glides from noise points to a deterministic component pattern. The evolution of the expected values of the two features is described in terms of p and r by (35) and (41). A simulated network of curves parameterized by p and r is displayed in Fig. 5 to illustrate this evolution. The signature of a deterministic spectral pattern in the Features Space is a curved cluster spreading from the noise area to an area of nonzero p and r, depending on the maximum magnitude of the pattern and on the size of the pattern relative to the cell size. The shape of the cluster depends on the simultaneous variations of p and r. It describes the magnitude variations of the spectral pattern. Consider a spectral pattern with constant local SNR. It is an extreme case of a spectral pattern with sharp edges. When the cell glides through the pattern, the proportion p first increases and then decreases, whereas the local SNR r is constant. Its representation in the Features Space is a cluster following an iso-r curve (plain line).

The local approach allows one to simplify the statistical model of the cell. The derived properties of the features depend on the two characterizing parameters p and r. This provides a description of the Features Space that is used both for the segmentation of the TFR and for the description of the magnitude variations of the extracted spectral patterns.

V. SEGMENTATION IN THE FEATURES SPACE

Before describing the segmentation algorithm, we discuss the choice of the cell size. Examples on simulated and real data are presented to validate the method.

A. Cell Size

The way features aggregate in the Features Space depends on the cell size. On the one hand, a cell that is small with respect to the TFR size ensures a local approach. On the other hand, the spread of the Features Space clusters decreases when the cell size increases

Fig. 5. Theoretical grid of the expected values of the two features, computed with a noise variance σ² = 10 and a cell of N = 7 × 3 points. The point (9.3, 10) is the noise expected value. The 15 values of parameters r and p are regularly spaced in [0, 6] and [0, 1], respectively (+). Circles represent the variance of the features at r ∈ {0.43, 1.29, 3, 6} and p ∈ {0.07, 0.21, 0.5, 1}.

since the feature estimators are consistent estimators of the moments. This increases the separability of the data in the Features Space. A local characterization by means of a large cell requires large amounts of overlap and zero padding. The counterpart of overlap and zero padding is a smoothing of the data due to the increased correlation of the TFR coefficients. We define the cell as the correlation support of its central point, which depends on the spectrogram configuration: size and shape of the time window, overlap, and zero padding. A compromise is provided in this way between the cell size and the dispersion of the features due to correlation. This choice also permits the characterization of each point by its region of influence.

This correlation is equivalent to the amount of redundancy in the TFR. This redundancy is quantified by the square modulus of the reproducing kernel, which is defined by (23) from the weighting window of (2) delayed in time and shifted in frequency. It measures the influence of


one point of the TFR on another [15]. We thus define the cell by (24), where the threshold is the level above which correlation is not negligible. As we work on discrete TFRs, an accurate value of this threshold is not necessary; namely, a value of 10 appears to be a good choice for the approximation of the correlation support in number of points. The spread of the cell corresponds to the TF uncertainty of the TFR.

B. Region Growing Algorithm

The full segmentation algorithm is described in Fig. 6. We consider in the following that, in the TF space, each deterministic signal region is separated from the others by a noise region. Such a signal region is extracted by a mechanism in which a label is associated with one (or more) well-chosen point called a seed. This seed is propagated to its neighborhood, provided that the neighborhood has properties similar to those of the seed itself. The propagation operates by assigning the seed label to the contaminated points. Usually, this implies that a similarity criterion between points to be contaminated is available. In our case, because the noise region properties may be derived from the parameter estimates, we choose to propagate until a prescribed degree of similarity to noise is reached by the unlabeled points. This criterion induces an implicit definition of the contours separating deterministic component patterns from background noise in the TFR.

Figs. 7 and 8 illustrate the procedure with synthetic signals. In Fig. 7, the signal is composed of three patterns of similar spread embedded in white Gaussian noise. The central spectral pattern is smoothed, whereas the two others present sharp edges. The right-hand component is of lower magnitude. In Fig. 8, the signal is composed of a narrowband pattern and a wideband pattern embedded in white Gaussian noise.

During the whole procedure, unlabeled points (that is, points labeled by zero) are assumed to belong to noise regions, and the associated local means are considered as having a χ² PDF. Two main steps are iterated until the local means of the unlabeled points fit a χ² PDF.

Step 1) Defining the propagation limit:
Estimation of the noise χ² PDF. At each iteration, the number of degrees of freedom and the noise power are estimated by the ML estimators of (18) and (19). A superscript denotes the iteration index.

Segmentation limit. In the Features Space, we determine a confidence region that corresponds to the noise region. Assuming a detection error probability, we define the limit of this region from the estimated noise PDF such that the probability of a noise point falling outside the region equals this error probability. Fig. 7(c) shows the limits defined at each of the three iterations required for that example. Outside this confidence region, we define what we call the working area, where points are candidates for the segmentation. The working area is defined by limits on the two features and implicitly determines a mask in the TFR, where patterns could be labeled.

Theoretical grid. The theoretical network of expected feature values is computed from (35) and (41) with the estimated noise power [see Fig. 7(c)]. Parameters p and r are regularly spaced in, respectively, [0, 1] and a bounded interval starting at 0.

Fig. 6. Segmentation algorithm.

Step 2) Propagating the seeds. The propagation operates as a lower level iterative procedure composed of two steps.

Seeds extraction. Seeds have to be selected in the previously defined working area. A histogram of the Features Space data is computed with circular bins, whose radius is derived from (36), centered on each sample of the theoretical network. For the highest ratio p, the points belonging to the first bin with the highest signal-to-noise ratio are chosen as seeds of the region [circles of Fig. 7(c)]. This way, seeds are selected in the inner part of a deterministic spectral pattern (high p and high r). A common label is assigned to all the seeds belonging to a common bin of the histogram.

Seeds propagation. From the seeds previously extracted in the Features Space, the propagation operates in the TFR. Each seed


Fig. 7. Segmentation of the spectrogram of a synthetic signal containing three narrowband components embedded in a white Gaussian noise of variance σ² = 19.5. The central spectral pattern is smoothed, whereas the two others present sharp edges. (a) Spectrogram containing 99 × 124 coefficients. (b) Extracted regions. (c) First theoretical grid (-o-) and the limits of propagation (plain lines), computed with P = 0.01, superimposed on the features space. (d) Extracted regions represented in a split features space.

contaminates the candidates among its eight nearest neighbors in the TFR by assigning them its label. These points contaminate their neighborhood in turn. The χ² PDF is then estimated again without the recently contaminated points. This propagation is validated by means of a Kolmogorov-Smirnov test [16], which checks the fit of the unlabeled local means to this new PDF. If the test is positive, the contamination is accepted, and a new label is assigned. The iteration changes when all the candidates have been tested. The algorithm then returns to Step 1, considering as noise the reduced set of unlabeled points. The already-labeled points are considered as seeds and propagate under the newly imposed constraints.

The segmentation procedure stops when the normalized maximum likelihood calculated to estimate the PDF of the unlabeled local means converges [Fig. 8(d)]. Note that the only control parameter is the probability of error. It has a consequence on the required number of iterations but not on the segmentation result.

The segmentation of the TFR of Fig. 7(a) identifies three spectral patterns separately. The algorithm converges after three iterations. Each iteration is a change of scale in the Features Space. This matches the structure of the Features Space highlighted by the theoretical grid. The third component of the signal is detected during the second iteration. The change of scale increases the resolution and allows the seed search to be computed on a denser theoretical network and to detect this

focused cluster [see Fig. 7(d)]. This figure highlights the shape of its cluster in the Features Space. It presents an inflexion similar to that of cluster (1), which is characteristic of spectral patterns with sharp edges (curves of constant r). The final noise power estimate is obtained for a white Gaussian process of variance σ² = 19.5.

The segmentation of the TFR of Fig. 8(a) identifies two deterministic component spectral patterns that differ in their TF spread. Both patterns are smoothed so that the corresponding clusters in the Features Space aggregate along linear curves [see Fig. 8(c)]. The tail of cluster (2) is located in the area of the highest parameter value. The variance of the additive Gaussian process (σ² = 10.4 in Fig. 8) is estimated after six iterations (10.8 in Fig. 8).

The analyzed signal of Fig. 9 is an underwater recording of the whistle of a dolphin. The spectrogram contains a succession of straight spectral patterns embedded in a colored noise. The TFR domain was reduced so that the embedding noise can be considered white. The segmentation process identifies three kinds of spectral patterns in terms of their local SNR. The segmentation result of Fig. 9(b) shows that the pattern with label (3) has the highest energy level. Two patterns were extracted with the same label (2) because their magnitude variations are similar. In Fig. 9(c), we present the bins that define the initial seeds. The one concerning label (1) cannot be seen because it is too close to the noise cluster. The result is a segmentation of the TFR without any preprocessing of the TFR data.
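The seed propagation step of the region growing algorithm can be sketched as a breadth-first contamination over the eight nearest neighbors. This is a simplified illustration, not the full procedure of Fig. 6: the working area is given as a fixed boolean mask, and the re-estimation of the χ² PDF, the Kolmogorov-Smirnov validation, and the outer iteration over Step 1 are all omitted. The function name `grow_regions` and the toy data are hypothetical.

```python
from collections import deque

def grow_regions(candidate, seeds):
    """Propagate seed labels over 8-connected candidate points.

    candidate: 2-D list of booleans (True = point lies in the working area).
    seeds: dict mapping (row, col) to a positive integer label.
    Returns a 2-D list of labels, where 0 means unlabeled (treated as noise).
    """
    n_rows, n_cols = len(candidate), len(candidate[0])
    labels = [[0] * n_cols for _ in range(n_rows)]
    queue = deque()
    for (r, c), lab in seeds.items():
        labels[r][c] = lab
        queue.append((r, c))
    while queue:
        r, c = queue.popleft()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                rr, cc = r + dr, c + dc
                if (dr or dc) and 0 <= rr < n_rows and 0 <= cc < n_cols:
                    # Contaminate unlabeled candidates with the current label.
                    if candidate[rr][cc] and labels[rr][cc] == 0:
                        labels[rr][cc] = labels[r][c]
                        queue.append((rr, cc))
    return labels

# Toy working area with two patterns separated by noise points.
mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 0, 1],
    [0, 0, 0, 0, 1, 1],
]
candidate = [[bool(v) for v in row] for row in mask]
labels = grow_regions(candidate, {(1, 1): 1, (3, 5): 2})
```

Because the two patterns are separated by points outside the working area, each seed label stays confined to its own pattern, which mirrors the assumption that deterministic regions are separated from each other by a noise region.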


Fig. 8. Segmentation of the spectrogram of a synthetic signal containing a sum of three linear chirps and a sum of seven truncated sines embedded in a white Gaussian noise of variance σ² = 10.4 (σ̂² = 10.8). (a) Spectrogram containing 127 × 199 coefficients. (b) Extracted regions (1) and (2), represented in (c) a split features space. (d) The algorithm has converged after six iterations.

Fig. 9. Whistle of a dolphin. (a) Spectrogram of a dolphin whistle, of 400 × 90 coefficients, is segmented (b) into four regions with labels ranging from 0 to 3. (c) Initial seeds of regions (2) and (3) are shown on the features space. (d) Histogram of the noise F after segmentation (+) and the estimated PDF (plain line).
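A comparison of the kind shown in panel (d) of Fig. 9, a noise histogram set against a fitted PDF, can be sketched as follows. This is only an illustration under stated assumptions: the noise coefficients are taken to follow a scaled central $\chi^2$ law, X = s · χ²(dof), with a known number of degrees of freedom (2 is used here as a hypothetical value), and the function names are not from the paper. With the degrees of freedom known, the maximum likelihood estimate of the scale is the sample mean divided by that number.

```python
import math

def fit_scaled_chi2(samples, dof):
    """ML estimate of s for X = s * chi2(dof), dof known: s = mean / dof."""
    return sum(samples) / (len(samples) * dof)

def scaled_chi2_pdf(x, s, dof):
    """PDF of s * chi2(dof), i.e. a gamma law of shape dof/2 and scale 2s."""
    a = dof / 2.0
    return (x ** (a - 1.0) * math.exp(-x / (2.0 * s))
            / ((2.0 * s) ** a * math.gamma(a)))

def histogram_density(samples, width, n_bins):
    """Histogram normalized so that it integrates to one over the bins."""
    counts = [0] * n_bins
    for x in samples:
        b = int(x // width)
        if 0 <= b < n_bins:
            counts[b] += 1
    return [c / (len(samples) * width) for c in counts]

def compare(samples, dof=2, width=0.5, n_bins=14):
    """Return the fitted scale and the largest histogram-to-PDF gap."""
    s = fit_scaled_chi2(samples, dof)
    dens = histogram_density(samples, width, n_bins)
    centres = [(b + 0.5) * width for b in range(n_bins)]
    pdf = [scaled_chi2_pdf(c, s, dof) for c in centres]
    return s, max(abs(h - p) for h, p in zip(dens, pdf))

def demo_samples(n=384):
    """Deterministic stand-in for noise data: quantiles of a unit-mean
    exponential law, which is a chi2 law with 2 degrees of freedom
    scaled by s = 0.5."""
    return [-math.log(1.0 - (k + 0.5) / n) for k in range(n)]
```

Calling `compare(demo_samples())` returns a scale close to 0.5 and a small gap between the histogram and the fitted PDF. On real data, the points left unlabeled by the segmentation would replace `demo_samples()`.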


VI. CONCLUSION

This paper presents a general method for automatic nonstationary signal interpretation based on the extraction of local statistical features from a TFR. It does not require any prior knowledge about the analyzed signal but exploits the general statistical properties of the chosen TFR. We focus on the spectrogram. We show that the spectrogram coefficients of a deterministic signal embedded in an additive white Gaussian noise have noncentral $\chi^2$ distributions with noncentral parameters equal to the spectrogram coefficients of the deterministic component. This drives us to choose the first- and second-order statistics of the spectrogram coefficients as extracted features. We show that such features vary randomly with two parameters, which measure a local SNR and the spread of the structures. These parameters describe the Features Space content. They allow the identification of spectral patterns in terms of their magnitude variations and TF spread. According to this statistical study, we propose a region growing algorithm to segment the TFR. The segmentation process is controlled by the TFR statistical properties. It iteratively leads to an efficient estimation of the noise power and a characterization of the TF evolution of the deterministic components. The first-order statistic is relevant for the noise characterization, whereas the second-order statistic permits discrimination of the deterministic structures. Work is in progress to include other features, such as higher order statistics, in order to obtain an accurate description of the content of the TF structures.

APPENDIX A
MOMENTS OF A NONCENTRAL $\chi^2$ DISTRIBUTION

The expression of the moments of a noncentral $\chi^2$ random variable is given in [10] without proof. We propose the following one. Let us consider a set of independent Gaussian variables of given mean and variance. The sum of their squares is, by definition, a noncentral $\chi^2$ random variable with a noncentral parameter not equal to zero. Its PDF is of the form [10] (25), where the modified Bessel function of the first kind that appears in it is defined by (26), with $\Gamma$ the Gamma function.

One can find in [18], for instance, the general expression of the moments of a random variable that has, by definition, a Rice distribution law (27), where the confluent hypergeometric function is defined by (28). The noncentral $\chi^2$ distribution is a generalization of the Rice distribution [10], as the square of a Rice-distributed variable follows a noncentral $\chi^2$ distribution. Substituting (28) into (27) leads to (29). The first ratio under the summation can be rewritten as a derivative, so that the sum under the derivative is the expansion into the Taylor series of the exponential function around zero (30). Finally, after the derivative has been expressed, the moment of a noncentral $\chi^2$ variable is given by (31).

This expression also covers the case of a zero noncentral parameter (central distribution). The proof is the same as the one presented, replacing the Rice distribution by a Rayleigh distribution [18].

APPENDIX B
MOMENTS OF THE FEATURES

We derive in this Appendix the expressions of the first and second moments of the two features. These expressions are necessary for the Features Space description.

Consider a set of independent and identically distributed random variables, their samples, and the moments about zero of their parent variable (32). The sample moments (33) are unbiased estimators of the moments about zero [12], with variance (34). Together with the definition (12) of the first feature of a cell, the above expressions lead to the first and second moments of this feature in terms of the moments of its parent variable, (35) and (36).

The derivation of the first and second moments of the second feature is not direct, since this feature is the square root of the empirical second moment about the mean. This statistic is an asymptotically unbiased estimator of the variance (37), expressed with the moments about the mean of the parent variable. Its variance is given by (38). As this variance varies as the inverse of the sample size and the square root is differentiable for positive arguments, the approximation (39) holds [12]. Substituting (37) and (38) into this approximation leads to the expression of the variance of the second feature (40). One can finally express the expected value of the second feature by substituting the variance obtained from (12), which gives (41), where the nonzero quantities are given explicitly. These quantities do not depend on the sample size. The variances (36) and (40) thus vary as the inverse of the sample size and tend to zero as the sample size tends to infinity. The two features are thus consistent estimators of the mean and the standard deviation of the cell.

REFERENCES

[1] R. Adams and L. Bischof, "Seeded region growing," IEEE Trans. Pattern Anal. Machine Intell., vol. 16, pp. 641-647, June 1994.
[2] R. G. Baraniuk and D. L. Jones, "A signal dependent time-frequency representation: Optimal kernel design," IEEE Trans. Signal Processing, vol. 41, pp. 1589-1601, Apr. 1993.
[3] Y.-L. Chang and X. Li, "Adaptive image region-growing," IEEE Trans. Image Processing, vol. 3, pp. 868-872, Nov. 1994.
[4] E. Chassande-Mottin, P. Flandrin, and F. Auger, "On the statistics of spectrogram reassignment vectors," Multidimen. Syst. Signal Process., vol. 9, pp. 355-362, Oct. 1998.
[5] M. Davy and C. Doncarli, "Optimal kernels of time-frequency representations for signal classification," in Proc. IEEE Int. Symp. Time-Freq. Time-Scale Anal., Pittsburgh, PA, Oct. 1998.
[6] U. Grenander, H. O. Pollak, and D. Slepian, "The distribution of quadratic forms in normal variates: A small sample theory with applications to spectral analysis," J. Soc. Indust. Appl. Math., vol. 7, no. 4, pp. 374-401, 1959.
[7] P. Flandrin, Time-Frequency/Time-Scale Analysis. New York: Academic, 1999.
[8] C. Hory, N. Martin, A. Chehikian, and L. E. Solberg, "Time-frequency space characterization based on statistical criterions," in Proc. EUSIPCO, Tampere, Finland, Sept. 4-8, 2000, pp. 214-217.
[9] C. Hory and N. Martin, "Maximum likelihood noise estimation for spectrogram segmentation control," in Proc. ICASSP, Orlando, FL, May 13-17, 2002.
[10] N. L. Johnson, S. Kotz, and N. Balakrishnan, Continuous Univariate Distributions, 2nd ed. New York: Wiley, 1995, vol. 2.
[11] P. E. Johnson and G. L. Long, "The probability density of spectral estimates based on modified periodogram averages," IEEE Trans. Signal Processing, vol. 47, pp. 1255-1261, May 1999.
[12] M. G. Kendall and A. Stuart, The Advanced Theory of Statistics. London, U.K.: Charles Griffin, 1963, vol. 2.
[13] L. H. Koopmans, The Spectral Analysis of Time Series. New York: Academic, 1974.
[14] B. Leprettre and N. Martin, "Extraction of pertinent subsets from time-frequency representations for detection and recognition purposes," Signal Process., vol. 82, no. 2, pp. 229-238, Feb. 2002.
[15] S. Mallat, A Wavelet Tour of Signal Processing. New York: Academic, 1999.
[16] R. von Mises, Mathematical Theory of Probability and Statistics. New York: Academic, 1964.
[17] V. Pierson and N. Martin, "Watershed segmentation of time-frequency images," in Proc. IEEE Workshop Non-Linear Signal Image Process., Halkidiki, Greece, June 20-22, 1995.
[18] J. G. Proakis, Digital Communications. New York: McGraw-Hill, 1995.
[19] P. D. Welch, "A direct digital method of power spectrum estimation," IBM J. Res. Devel., vol. 5, no. 2, pp. 141-159, Apr. 1961.
[20] P. D. Welch, "The use of fast Fourier transform for the estimation of power spectra: A method based on time averaging over short, modified periodograms," IEEE Trans. Audio Electroacoust., vol. AU-15, pp. 70-73, June 1967.

Cyril Hory received the M.S. degree in applied acoustics from Université du Maine, Le Mans, France, in 1999. He is currently pursuing the Ph.D. degree with the Laboratoire des Images et des Signaux (LIS), Grenoble, France.

Nadine Martin received the Eng. degree in 1980 and the Ph.D. degree in 1984. She is a Director of Research at the National Centre of Scientific Research (CNRS) and the head of GOTA, a team within Laboratory LIS, Grenoble, France. In the signal processing domain, her research interests are the analysis and the interpretation of nonstationary signals. She is now working on time-frequency decision, multipulse modeling, and fault detection. Real vibratory signals are studied in particular, in relation with the mechanical model of the system. She is also directing a project on an automatic spectral analysis system (ASpect TetrAS). She is the author of about 60 papers and communications. Dr. Martin was co-organizer of the Fourth European Signal Processing Conference (EUSIPCO), of a pre-doctoral course on the recent advances in signal processing (Les Houches '93), of the Sixth French Symposium on Signal and Image Processing (GRETSI'97), and of a special session on diagnostics and signal processing at IEEE-SDEMPED'97.

Alain Chehikian is Professor with the Université Joseph Fourier, Grenoble, France. He was head of Laboratoire de Traitement d'Images et de Reconnaissance de Formes (LTIRF), Institut National Polytechnique de Grenoble (INPG). His research concerns segmentation, image description, and algorithm-architecture adequation.
