IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, VOL. 7, NO. 3, JULY 2010

A New Sparse Representation of Seismic Data Using Adaptive Easy-Path Wavelet Transform
Jianwei Ma, Gerlind Plonka, and Hervé Chauris

Abstract: Sparse representation of seismic data is a crucial step for seismic forward modeling and seismic processing such as coherent noise separation, imaging, and sparsity-promoting data recovery. In this letter, a new locally adaptive wavelet transform, called easy-path wavelet transform (EPWT), is applied for the sparse representation of seismic data. The EPWT is an adaptive geometric wavelet transform that works along a series of special pathways through the input data and exploits the local correlations of the data. The transform consists of two steps: reorganizing the data following the pathways according to the data values and then applying a 1-D wavelet transform along the pathways. This leads to a very sparse wavelet representation. In comparison to conventional wavelets, the EPWT concentrates most of the signal energy at smooth scales and needs fewer significant wavelet coefficients to represent signals. Numerical experiments show that the new method is superior to conventional wavelets and curvelets in terms of sparse representation and compression of seismic data.

Index Terms: Adaptive wavelets, curvelets, easy-path wavelet transform (EPWT), seismic processing, sparse representation.

I. INTRODUCTION

Current seismic acquisitions deploy a few tens of thousands of shot gathers [24]. Each shot point may also contain more than $10^4$ receivers for land seismic data. For a single source-receiver pair, the signal is typically recorded for a few seconds at a sampling interval of a few milliseconds, leading to a few thousand samples per trace. In total, 3-D acquisitions easily contain data volumes of a few terabytes. In that context, an efficient transform for compressing and later processing the data is a key element. The most striking feature in seismic data is the presence of texture-like wavefronts. In the last two decades, wavelets have been a popular tool for seismic data processing. A primary aim of wavelet expansions is to find a sparse representation for seismic data. A signal representation is sparse when it can capture a signal with a small number of significant coefficients or components. Generally, a sparser transform is attractive for signal processing tasks, e.g., data compression and

Manuscript received August 8, 2009; revised November 17, 2009 and January 15, 2010. Date of publication March 29, 2010; date of current version April 29, 2010. This work was supported in part by the National Natural Science Foundation of China under Grant 40704019 and in part by the Projects PL 170/11-2 and PL 170/13-1 of Deutsche Forschungsgemeinschaft. J. Ma is with the Institute of Seismic Exploration, School of Aerospace, Tsinghua University, Beijing 100084, China (e-mail: jma@tsinghua.edu.cn). G. Plonka is with the Department of Mathematics, University of Duisburg-Essen, 47048 Duisburg, Germany. H. Chauris is with the Centre of Géoscience, Mines ParisTech, 77305 Paris, France. Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/LGRS.2010.2041185

fast forward modeling, taking advantage of the multiscale and local time-frequency analysis. Tensor product 2-D wavelets are not optimal for representing geometric structures because their support is not suited to directional geometric properties. In recent years, several geometric wavelets, including curvelets [1], [2], shearlets [10], and contourlets [7], have been proposed to represent directional features. These nonadaptive, highly redundant function frames have strong anisotropic directional selectivity. Recently, curvelets have been applied in seismic exploration, e.g., to seismic denoising [11], [18], [22], data recovery [12], multiple removal [13], migration [3], imaging [8], and forward modeling [23]. More applications of curvelets can be found in a recent review paper by two of the authors [16]. However, curvelets are quite redundant, i.e., the number of coefficients in the curvelet domain is about three to seven times larger than the number of pixels in the original data. This redundancy is useful for denoising, but it prevents the use of curvelets in data compression and forward modeling, where a sparse representation is very important. Moreover, curvelets and shearlets lose their almost optimal approximation properties when the data contain singularities along curves that are not exactly $C^2$ smooth.

In order to overcome these drawbacks, some adaptive wavelet transforms have been presented in the image processing community. For instance, Le Pennec and Mallat [15] proposed bandlet orthogonal bases and frames that adapt to the geometric regularity of the data. The main idea of bandlets is to warp wavelets along a geometric flow and to generate bandlet orthogonal bases in different bands. Dekel and Leviatan [6] introduced a geometric wavelet transform based on an adaptive binary partition of the image domain to match the geometric features of the image.
Krommweh [14] proposed an adaptive Haar wavelet transform, named tetrolet transform, which allows so-called tetromino partitions such that the local image geometry is taken into account. Some other approaches for locally adaptive image representations also exist, based on the lifting scheme, e.g., [4] and [9]. Using the seislet transform [9], one needs a priori information such as the local dip along which the signal is smooth, but then, the stability of the wavelet transform strongly suffers, and the obtained results only marginally outperform the usual fast wavelet transform.

Recently, a nonlinear locally adaptive easy-path wavelet transform (EPWT) has been proposed by Plonka [19] for a sparse representation of 2-D images. The EPWT is related to the idea of grouplets [17], where one applies a weighted Haar wavelet transform to points that are grouped by a so-called association field. The concept of the EPWT is very effective and simple to understand. Starting with some suitable point of a given data set, one seeks a path through all data points

1545-598X/$26.00 © 2010 IEEE

MA et al.: NEW SPARSE REPRESENTATION OF SEISMIC DATA


such that there is a strong correlation between neighboring data points along this path. Using a best-neighbor strategy under a given optimality rule, every data point is used in the pathways exactly once. Then, one can apply a suitable 1-D discrete wavelet transform to the function along the path. The choice of the path vector ensures that most wavelet coefficients remain small. The same idea is then applied again to the wavelet low-pass part. The EPWT has a generalized multiresolution structure, with adaptive scaling and wavelet functions that depend on the considered image. It also works well for other data structures, e.g., for data on a sphere by suitable extension [20]. It has been proved that the EPWT leads, for a suitable choice of the pathways, to optimal $N$-term approximation for piecewise Hölder continuous functions with singularities along curves [21]. The idea of the EPWT is not restricted to the 2-D case; it can also be applied to higher dimensional data.

In this letter, we apply the EPWT, for the first time, to obtain sparse representations of seismic data in the context of oil and gas exploration. Numerical experiments show that it is promising for potential applications in seismic fields.

II. EPWT

In each decomposition level, the EPWT includes two basic steps: 1) finding the path vector and then 2) applying a 1-D discrete wavelet transform along the path vector. Let $N_1$ and $N_2$ be two positive integers with $N_1 N_2 = 2^L s$ and $L, s \in \mathbb{N}$. Let $f = (f(i, j))_{(i,j) \in I}$ be a digital function, where $I = \{(i, j) : i \in [0, N_1 - 1],\ j \in [0, N_2 - 1]\}$ denotes an index set. In seismic exploration, for a 2-D shot gather, $i$ denotes the index of the time coordinate, and $j$ denotes the index of the spatial receiver coordinate. Let $J = J(I)$ be a 1-D index set for $I$ obtained by rearranging $I$ column by column, i.e., $J(i, j) := i + j N_1$. Furthermore, we define a neighborhood of an index $(i, j) \in I$ by

$$N(i, j) = \{(i_1, j_1) \in I \setminus \{(i, j)\} : |i - i_1| \le A,\ |j - j_1| \le A\}.$$

In this letter, we consider $A = 1$.
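The two steps outlined above (a path through strongly correlated neighbors, followed by a 1-D wavelet transform along it) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `neighbors`, `easy_path`, and `haar_step`, the tie-breaking by smallest index, the restart at the smallest unused index, and the use of a Haar filter are all assumptions made for illustration. Data are stored as a flat list in the column-by-column order $J(i, j) = i + j N_1$.

```python
import math


def neighbors(idx, n1, n2, a=1):
    """1-D indexes in the neighborhood N(i, j) of idx = i + j * n1."""
    i, j = idx % n1, idx // n1
    result = []
    for j1 in range(max(0, j - a), min(n2, j + a + 1)):
        for i1 in range(max(0, i - a), min(n1, i + a + 1)):
            if (i1, j1) != (i, j):
                result.append(i1 + j1 * n1)
    return result


def easy_path(f, n1, n2):
    """Greedy path visiting every index once, starting at index 0."""
    unused = set(range(n1 * n2))
    path = [0]
    unused.discard(0)
    while unused:
        current = path[-1]
        candidates = [k for k in neighbors(current, n1, n2) if k in unused]
        if candidates:
            # Closest value wins; ties go to the smallest index (assumption).
            nxt = min(candidates, key=lambda k: (abs(f[current] - f[k]), k))
        else:
            # Path interrupted: restart at the smallest unused index (assumption).
            nxt = min(unused)
        path.append(nxt)
        unused.discard(nxt)
    return path


def haar_step(values):
    """One level of the orthonormal Haar transform of an even-length list."""
    s = math.sqrt(2.0)
    half = len(values) // 2
    low = [(values[2 * l] + values[2 * l + 1]) / s for l in range(half)]
    high = [(values[2 * l] - values[2 * l + 1]) / s for l in range(half)]
    return low, high


if __name__ == "__main__":
    # 2 x 2 toy data in column-by-column order: equal values sit on the diagonal.
    f = [0.0, 5.0, 5.0, 0.0]
    path = easy_path(f, 2, 2)
    low, high = haar_step([f[k] for k in path])
    print(path)  # [0, 3, 1, 2]
    print(high)  # [0.0, 0.0]
```

In this toy case, the path groups equal values together, so both high-pass coefficients vanish, whereas a Haar step on the data in storage order would produce two nonzero high-pass coefficients.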
Hence, an index that does not lie at the boundary has eight neighbors, an index at the boundary has five neighbors, and an index at a corner has only three neighbors. In the first step of the EPWT, we determine a path vector $p^L$ through the index set $I$, resp. $J(I)$, such that successive components $p^L(i)$ and $p^L(i + 1)$ in the vector are indexes that are (usually) neighbors, i.e., $p^L(i + 1) \in N(p^L(i))$. This path vector has to be locally adapted to the digital function $f$. If using $A > 1$, one can choose the next index from an extended neighboring area. This nonlocal strategy can lead to better performance at the cost of more computations.

Let us consider an $L$-level decomposition of the EPWT. At the first level, we determine a complete path vector $p^L$ through $I$, resp. $J(I)$, and then apply a 1-D periodic wavelet transform to the function values along this path $p^L$. For simplicity, we use the 1-D index set $J(I)$. We start with $p^L(0) := 0$ and search for the minimum of the absolute differences of the function values over the neighborhood of index 0 to determine the second index $p^L(1)$:

$$p^L(1) := \arg\min_k |f^L(0) - f^L(k)|, \quad k \in \{1, N_1, N_1 + 1\}.$$

For instance, assuming the second value $p^L(1)$ is equal to $N_1 + 1$, we can find $p^L(2)$ by looking for the minimum of the absolute differences of the function values over the neighborhood of index $N_1 + 1$, except index 0, which has already been used in the path vector $p^L$. We then have

$$p^L(2) := \arg\min_k |f^L(N_1 + 1) - f^L(k)|, \quad k \in \{1, 2, N_1, N_1 + 2, 2N_1, 2N_1 + 1, 2N_1 + 2\}.$$

Generally, given the index $p^L(l)$, $l \in [0, N_1 N_2 - 1]$, we determine the next value $p^L(l + 1)$ by

$$p^L(l + 1) := \arg\min_k |f^L(p^L(l)) - f^L(k)|, \quad k \in N(p^L(l)),\ k \ne p^L(\nu),\ \nu = 0, \ldots, l.$$

We proceed in this simple manner to determine a path vector through the index set $J(I)$ that is locally adapted to the function $f$. This is the so-called easy path. With the help of this path vector, we reorder the data values according to their size and put points with similar values together. The aim is to find a path vector of indexes such that the absolute differences between neighboring function values along the path remain as small as possible (i.e., the function along the path becomes smoother), so that the wavelet transform along the path yields fewer significant wavelet coefficients. In this way, we achieve a sparse representation of the data. If the choice of the neighbor is not unique, one may use a pregiven rule (e.g., a favorite direction) to determine a unique pathway. If, for a given $p^L(l)$, no neighbor can be chosen (i.e., all indexes in the neighborhood have already been used in the path $p^L$), one needs to interrupt the path and start with a new index $p^L(l + 1)$ from the remaining unused indexes in $J(I)$.

After finding the complete path vector $p^L$, we apply a one-level discrete orthogonal or biorthogonal wavelet transform to the vector of function values $(f^L(p^L(l)))$ along the path $p^L$. We obtain low-pass smooth coefficients $(c_l^{L-1})_{l=0}^{N_1 N_2/2 - 1}$ and high-pass wavelet coefficients $(d_l^{L-1})_{l=0}^{N_1 N_2/2 - 1}$ after the one-level decomposition. Then, we apply the same strategy to the low-pass part and determine a path vector $p^{L-1}$ for the second-level wavelet decomposition. For this purpose, we relate the low-pass coefficient $c_l^{L-1}$ to the two neighboring indexes $p^L(2l)$ and $p^L(2l + 1)$ and seek a path vector through the obtained index sets $\{p^L(2l), p^L(2l + 1)\}$, $l = 0, \ldots, N_1 N_2/2 - 1$, where we, again, exploit the local correlations between neighboring data values. We repeat the same strategy in further levels. The number of decomposition levels depends on the data.
Usually, five to ten levels are enough to obtain satisfactory results. The inverse EPWT needs the indexes of the vector $p = (p^L, p^{L-1}, \ldots)$ containing the path vectors of all levels. Therefore, the path vector $p$ has to be stored in memory. Searching for and storing the path vectors account for a substantial part of the cost of the EPWT. In Section III, we describe how these costs can be considerably reduced for seismic data. The EPWT can be understood as an adaptive geometric wavelet transform. Here, we just outline the forward and inverse EPWT algorithms in Tables I and II, respectively. For more details on the EPWT, we refer to [19].

III. CODING STRATEGIES FOR THE PATH VECTORS OF EPWT

The bottleneck of the EPWT is certainly the cost related to storing the path vectors $p^j$. However, particularly for seismic


data, one can reduce these costs substantially if a large portion of the data does not contain essential information. The cost of storing $p^L$ even vanishes for the usual wavelet transform along data rows or columns [see Fig. 1(a)]. However, the EPWT is much more effective than the tensor product wavelet transform owing to its ability to exploit the local correlations of the data. The path vector $p^L$ in Fig. 1(b) can be stored with minimal memory allocation. Compared to that in Fig. 1(a) with the usual path, where horizontal directions are preferred, we only need to store the direction from the fourth to the fifth index (which is different from the usual way) and from the ninth to the tenth index. Then, the complete path vector in Fig. 1(b) is already determined by the specification that each index is contained exactly once in the path vector and that successive components are neighboring indexes.

In Fig. 2, we show another example (a detail of Fig. 3), where the path vector $p^L$ of length 384 is chosen such that the previous direction of the path is preferred as long as the data values of the corresponding pixels do not differ too much. For that purpose, we can fix a certain bound $C$ and say that, for given indexes $p^L(l - 1)$ and $p^L(l)$, the next path component $p^L(l + 1)$ is taken in the path direction if $|f(p^L(l)) - f(p^L(l + 1))| < C$. This means that we no longer seek the optimal neighbor but prefer the first neighbor (in the known direction) if the corresponding function value is good enough. Obviously, with a large bound $C$, we end up with a path vector like that in Fig. 1(a), as for the usual wavelet transform along the rows of the data, and without adaptivity costs. The path $p^L$ in Fig. 2 now has a (relatively) small entropy since we need to store only the changes of path direction. This can be done with small numbers. Moreover, we can take into account that each index occurs only once in the path vector, such that the number of admissible neighbors of an index gets smaller. For further levels of the EPWT, similar strategies can be developed. For example, our experiments for the data in Fig. 3 with gray values between 0 and 255 have shown that $p^L$ can be stored with an entropy of about 0.8 bits per pixel (bpp) for $C = 20$, 0.4 bpp for $C = 40$, and 0.07 bpp for $C = 100$.

TABLE I. FORWARD EPWT ALGORITHM

TABLE II. INVERSE EPWT ALGORITHM

Fig. 1. Simple example to show the performance of easy path. (a) Path without storage cost. (b) Path that takes the data values into account.

Fig. 2. Possible path vector in the detailed data of Fig. 3.

Fig. 3. Reconstruction by the largest 1024 coefficients. (a) Original simulated seismic data. (b) Wavelet reconstruction. SNR = 1.52 dB. (c) Curvelet reconstruction. SNR = 3.79 dB. (d) EPWT reconstruction. SNR = 10.93 dB.

IV. NUMERICAL EXPERIMENTS

In this section, we demonstrate the good performance of the EPWT for seismic data compression, in comparison with


Fig. 4. (a) Reconstruction using 20 000 coefficients in the bandlet transform. SNR = 2.50 dB. (b) Coefficients after sorting from large to small amplitudes. The solid, dotted, and dotted-dashed lines denote EPWT, wavelets, and curvelets, respectively.

Fig. 5. (a) SNR versus number of reconstructed coefficients. (b) SNR versus percentage of reconstructed coefficients to total coefficients. The dotted, dotted-dashed, and solid lines denote wavelets, curvelets, and EPWT, respectively.

wavelets and curvelets. In the experiments, we use the Daubechies DB4 wavelets [5] for the wavelet transform and the second-generation curvelets [2] for the discrete curvelet transform. Fig. 3(a) shows a typical simulated 2-D shot gather of size 128 × 128. Fig. 3(b)-(d) shows the results reconstructed by using the largest 1024 coefficients of wavelets, curvelets, and EPWT (using the 1-D DB4 wavelet filter bank), respectively. The EPWT provides a much better sparse representation than the wavelets and curvelets, in terms of both signal-to-noise ratio (SNR) values and preservation of wavefronts. Due to the high redundancy of curvelets, the fixed-number reconstruction has a lower SNR value than the wavelet reconstruction. Unfortunately, wavelet reconstructions result in serious oscillating artifacts (mosaic phenomenon) along the edges, due to their poor ability to analyze curve singularities. Fig. 4(a) shows a comparative result obtained with the bandlet transform. We select the largest 20 000 bandlet coefficients in this case (and hence, more than the 16 384 coefficients of the original image). Taking only 1024 coefficients for the reconstruction does not result in visible geometric features. Although the bandlet transform works along geometric edges, it cannot capture enough features with these coefficients because of its large redundancy. Another similar transform is the grouplet transform, which performs a wavelet transform along a geometric path. However, the current grouplet version is only available for Haar transforms, and there are, again, substantial adaptivity costs since the association field has to be stored in each level. Fig. 4(b) shows the sorted coefficients for the different transforms; the EPWT coefficients have the fastest decay. Fig. 5 shows the change of SNR of the reconstructed data as the number of coefficients increases. The EPWT displays a good ability for sparse representation of seismic data because it can find the pathways easily along the wavefronts.
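The comparison procedure described above (keep only the largest coefficients, reconstruct, and measure the SNR) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code used for the figures: a 1-D orthonormal Haar transform stands in for the DB4 filter bank, the helper names are hypothetical, and the SNR is assumed to be $10 \log_{10}(\|f\|^2 / \|f - \tilde{f}\|^2)$, a common definition that the letter does not spell out.

```python
import math


def haar_forward(x):
    """Full orthonormal Haar decomposition; len(x) must be a power of two."""
    s = math.sqrt(2.0)
    low = list(x)
    details = []
    while len(low) > 1:
        half = len(low) // 2
        high = [(low[2 * l] - low[2 * l + 1]) / s for l in range(half)]
        low = [(low[2 * l] + low[2 * l + 1]) / s for l in range(half)]
        details = high + details  # coarser detail bands go in front
    return low + details


def haar_inverse(c):
    """Inverse of haar_forward."""
    s = math.sqrt(2.0)
    low = [c[0]]
    pos = 1
    while pos < len(c):
        high = c[pos:pos + len(low)]
        pos += len(low)
        finer = []
        for a, d in zip(low, high):
            finer.extend([(a + d) / s, (a - d) / s])
        low = finer
    return low


def keep_largest(c, k):
    """Zero all but the k coefficients of largest magnitude."""
    order = sorted(range(len(c)), key=lambda n: abs(c[n]), reverse=True)
    kept = set(order[:k])
    return [v if n in kept else 0.0 for n, v in enumerate(c)]


def snr_db(original, approx):
    """SNR in dB under the assumed definition 10*log10(signal / error energy)."""
    signal = sum(v * v for v in original)
    noise = sum((v - w) ** 2 for v, w in zip(original, approx))
    return float("inf") if noise == 0.0 else 10.0 * math.log10(signal / noise)


if __name__ == "__main__":
    x = [4.0, 4.0, 2.0, 2.0]
    coeffs = haar_forward(x)
    approx = haar_inverse(keep_largest(coeffs, 1))
    print(round(snr_db(x, approx), 6))  # 10.0
```

For an EPWT-style experiment, the same thresholding would be applied to coefficients computed along the path vectors rather than along the rows of the data.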
Even using the same percentage of coefficients (where the curvelet transform uses almost three times as many coefficients), as shown in Fig. 5(b), the EPWT displays SNR values similar to those of curvelets, which are known to be optimal for sparse representation of data containing smooth discontinuities [1]. In Fig. 6, we test the methods on a real seismic gather. The original size of the data is 1125 × 1511, i.e., the vertical time coordinate (0-4.5 s) has 1125 samples in each trace, and the horizontal spatial coordinate has 1511 traces. The interval between traces is 4 ms. In our test, we take

Fig. 6. Reconstruction of real seismic data by using 1024 coefficients. (a) Original data. (b) Curvelet reconstruction. SNR = 0.51 dB. (c) Wavelet reconstruction. SNR = 2.57 dB. (d) EPWT reconstruction. SNR = 12.51 dB.

a central part of the data with size 256 × 256, as shown in Fig. 6(a). The vertical axis corresponds to time, and the horizontal axis corresponds to the position of the receivers. We consider the reconstructions by using the largest 1024 coefficients of curvelets, wavelets, and EPWT. The EPWT is clearly superior for edge-preserving sparse reconstruction. Fig. 7(a)-(c) shows the differences between the input data and the data reconstructed by using the curvelets, the wavelets, and the EPWT, respectively. Fig. 7(d) shows a comparison of central traces taken from Fig. 6(a)-(d). It can be seen that the EPWT method preserves the amplitude of the data well, which is important for practical applications. Fig. 8 shows the ability of the EPWT method to compress seismic data with aliased dips. To obtain data with geometrical aliasing, we select, for example, one trace out of every ten from the original input data with 1511 traces. The new input data now consist of 152 traces. Fig. 8(a) shows the data with aliased dips, where the wavefronts associated with large dips are not continuous. The sampling on the horizontal axis corresponds to every tenth trace, i.e., 1, 11, . . . , 1511. Fig. 8(b)-(d) shows the compressed results obtained by reconstructing from the 1024 largest coefficients of curvelets, wavelets, and EPWT, respectively. Once more, the EPWT works better for the


compression of data even with aliased dips. The signal is well preserved in the compressed result. In all the aforementioned comparisons, we only consider sparse approximation (without a coding step) for data compression. An additional coding step would naturally further improve the performance of the compression methods.

Fig. 7. Reconstruction errors by (a) curvelets, (b) wavelets, and (c) EPWT. (d) Comparison of central traces from reconstructed results by different methods, as shown in Fig. 6. The dotted lines denote the original data. The solid lines, from left to right, denote curvelet, wavelet, and EPWT reconstruction.

Fig. 8. Compression of data with aliased dips. (a) Original data with aliased dips (missed traces). (b) Curvelet compression. SNR = 6.81 dB. (c) Wavelet compression. SNR = 0.45 dB. (d) EPWT compression. SNR = 7.85 dB.

V. CONCLUSION

In this letter, we have applied a new adaptive EPWT for sparse representation and compression of seismic data. As a first step, numerical results for 2-D $(t, x)$ seismic data show good performance of the EPWT method in this field. The next step would be to develop a high-dimensional EPWT algorithm for sparse representation of large volumes of data depending on possibly five dimensions: time and two spatial coordinates for both sources and receivers. The EPWT can be widely applied to other fields such as satellite data compression and image processing. An alternative strategy to reduce the high computational cost of the EPWT is to apply the EPWT in wavelet subbands. In our experiments, this strategy needs only half the computation time for 256 × 256 data.

REFERENCES

[1] E. Candès and D. Donoho, "New tight frames of curvelets and optimal representations of objects with piecewise $C^2$ singularities," Commun. Pure Appl. Math., vol. 57, no. 2, pp. 216-266, Feb. 2004.
[2] E. Candès, L. Demanet, D. Donoho, and L. Ying, "Fast discrete curvelet transforms," Multiscale Model. Simul., vol. 5, no. 3, pp. 861-899, Sep. 2006.
[3] H. Chauris and T. Nguyen, "Seismic demigration/migration in the curvelet domain," Geophysics, vol. 73, no. 2, pp. S35-S46, Mar./Apr. 2008.
[4] R. Claypoole, G. Davis, W. Sweldens, and R. Baraniuk, "Nonlinear wavelet transforms for image coding via lifting," IEEE Trans. Image Process., vol. 12, no. 12, pp. 1449-1459, Dec. 2003.
[5] I. Daubechies, Ten Lectures on Wavelets. Philadelphia, PA, 1992.
[6] S. Dekel and D. Leviatan, "Adaptive multivariate approximation using binary space partitions and geometric wavelets," SIAM J. Numer. Anal., vol. 43, no. 2, pp. 707-732, 2006.
[7] M. Do and M. Vetterli, "The contourlet transform: An efficient directional multiresolution image representation," IEEE Trans. Image Process., vol. 14, no. 12, pp. 2091-2106, Dec. 2005.
[8] H. Douma and M. de Hoop, "Leading-order seismic imaging using curvelets," Geophysics, vol. 72, no. 6, pp. S231-S248, Nov./Dec. 2007.
[9] S. Fomel, "Towards the seislet transform," in Proc. SEG, New Orleans Annu. Meeting, 2006, pp. 2847-2851.
[10] K. Guo and D. Labate, "Optimally sparse multidimensional representation using shearlets," SIAM J. Math. Anal., vol. 39, no. 1, pp. 298-318, 2007.
[11] G. Hennenfent and F. Herrmann, "Seismic denoising with nonuniformly sampled curvelets," IEEE Comput. Sci. Eng., vol. 8, no. 3, pp. 16-25, May/Jun. 2006.
[12] F. Herrmann and G. Hennenfent, "Non-parametric seismic data recovery with curvelet frames," Geophys. J. Int., vol. 173, no. 1, pp. 233-248, Apr. 2008.
[13] F. Herrmann, D. Wang, and D. Verschuur, "Adaptive curvelet-domain primary-multiple separation," Geophysics, vol. 73, no. 3, pp. A17-A21, May/Jun. 2008.
[14] J. Krommweh, "Tetrolet transform: A new adaptive Haar wavelet algorithm for sparse image representation," J. Vis. Commun. Image Representation, 2010, to be published.
[15] E. Le Pennec and S. Mallat, "Sparse geometric image representations with bandelets," IEEE Trans. Image Process., vol. 14, no. 4, pp. 423-438, Apr. 2005.
[16] J. Ma and G. Plonka, "A review of curvelets and recent applications," IEEE Signal Process. Mag., vol. 27, no. 2, 2010, to be published.
[17] S. Mallat, "Geometrical grouplets," Appl. Comput. Harmon. Anal., vol. 26, no. 2, pp. 161-180, Mar. 2009.
[18] R. Neelamani, A. Baumstein, and D. Gillard, "Coherent and random noise attenuation using the curvelet transform," Leading Edge, vol. 27, no. 2, pp. 240-248, Feb. 2008.
[19] G. Plonka, "The easy path wavelet transform: A new adaptive wavelet transform for sparse representation of two-dimensional data," Multiscale Modelling Simul., vol. 7, no. 3, pp. 1474-1496, 2009.
[20] G. Plonka and D. Rosca, "Easy path wavelet transform on triangulations of the sphere," Math. Geosci., to be published.
[21] G. Plonka, S. Tenorth, and A. Iske, "Optimally sparse image representation by the easy path wavelet transform," 2009, preprint.
[22] H. Shan, J. Ma, and H. Yang, "Comparisons of wavelets, contourlets, and curvelets in seismic denoising," J. Appl. Geophys., vol. 69, no. 2, pp. 103-115, Oct. 2009.
[23] B. Sun, J. Ma, H. Chauris, and H. Yang, "Solving the wave equation in the curvelet domain: A multi-scale and multi-directional approach," J. Seismic Exploration, vol. 18, pp. 385-399, 2009.
[24] A. Vesnaver, "Yardsticks for industrial tomography," Geophys. Prospecting, vol. 56, no. 4, pp. 457-465, Jul. 2008.
