
SIMILARITY SEARCH OVER TIME-SERIES DATA USING WAVELETS: ECG WAVE FORM

Mayank Vibhuti Jha, Hemant Singh, Anurag Agrawal, Parameshwar B. Birajdar, Sajith.K
Department of Electrical Engineering, IIT Bombay

Abstract
The Discrete Wavelet Transform (DWT) is a powerful tool for removing noise from natural time-series data. Wavelets also play a significant role in recognizing different parts of a signal simultaneously in the time-frequency domain. A time series is a real-valued sequence that represents the status of a single variable over time. To illustrate this concept, this work takes the output waveform of an electrocardiogram (ECG) as the time-series data to be processed. The Daubechies wavelet (db6) is used for feature extraction from the ECG data, and features related to some abnormalities of an arrhythmic heart are sought. The data set was taken from the MIT-BIH Arrhythmia database.

Dimensionality reduction
In this work we use ECG time-series data to detect anomalies in the electrocardiogram. For any given use of time-series data, the analysis usually involves queries that express the notion of similarity as perceived by the user. One approach to determining the similarity between two sequences of the same length is to use the Euclidean distance: two sequences are said to be similar if their Euclidean distance is less than a user-defined threshold. In time-series databases, the query length determines the dimensionality of the index [1]. Feature extraction reduces this dimensionality. It is performed by applying a transform to each input sequence and then keeping only a subset of the coefficients. As a result, each sequence of length N is mapped to a point in an Nf-dimensional feature space, where Nf << N.
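The mapping described above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes an orthonormal Haar transform in place of db6, a sequence length that is a power of two, and hypothetical function names. Because the transform is orthonormal, the distance between truncated coefficient vectors never exceeds the distance between the original sequences, so a threshold test in feature space cannot miss a truly similar pair.

```python
import math

def haar_transform(seq):
    """Full orthonormal Haar decomposition; len(seq) must be a power of two."""
    coeffs = list(seq)
    n = len(coeffs)
    s = math.sqrt(2.0)
    while n > 1:
        half = n // 2
        approx = [(coeffs[2 * i] + coeffs[2 * i + 1]) / s for i in range(half)]
        detail = [(coeffs[2 * i] - coeffs[2 * i + 1]) / s for i in range(half)]
        coeffs[:n] = approx + detail  # coarse part first, details after it
        n = half
    return coeffs

def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def features(seq, n_f):
    """Keep only the n_f coarsest coefficients (assumed choice of subset)."""
    return haar_transform(seq)[:n_f]

x = [1.0, 2.0, 3.0, 4.0, 4.0, 3.0, 2.0, 1.0]
y = [1.0, 2.0, 2.0, 5.0, 4.0, 4.0, 2.0, 0.0]
d_full = euclidean(x, y)                            # distance in the original space
d_feat = euclidean(features(x, 2), features(y, 2))  # distance in feature space
```

A query then only needs the full-length comparison for candidates whose feature-space distance is already below the threshold.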

Introduction
Extensive growth in the field of computers has allowed us to store data received from various sensors and monitoring systems. Such databases can be used later for reference, and decision makers regard huge databases as invaluable sources of information. Because data has accumulated at an explosive rate, techniques are needed to extract the relevant data. In this report, one such method, based on the wavelet transform, has been verified. A hidden pattern in the waveform can be located by constructing a master feature vector for the required data. The objective of the methodology was to eliminate false negatives while remaining sensitive to positives. The same approach can be used for gaining better insight into the data and analyzing trends to determine and evaluate the current state of interest; making realistic predictions, thus advancing the strategic planning process; and detecting anomalies.

Feature Extraction of ECG time series data


The ECG is a graphic record of the direction and magnitude of the electrical activity generated by depolarization and repolarization of the atria and ventricles. One cardiac cycle in an ECG signal consists of the P-QRS-T waves. Most of the clinically useful information in the ECG is found in the intervals and amplitudes defined by its features (characteristic wave peaks and time durations). Feature extraction reduces the dimensionality of long records. An ECG feature extraction system supplies the features for similarity search in long records of ECG data and thus helps to detect anomalies of the heart.

In a large variety of applications, the aim is to find a wavelet that very efficiently represents the signal of interest. Wavelet families include the Biorthogonal, Coiflet, Haar, Symmlet and Daubechies wavelets. There is no absolute way to choose a certain wavelet; the choice of the wavelet function depends on the application. Selecting a wavelet function that closely matches the signal to be processed is of utmost importance in wavelet applications. The Daubechies wavelet family is similar in shape to the QRS complex, and its energy spectrum is concentrated around low frequencies [2].

Description of the algorithm


Fig 1: Normal ECG signal and its various components

First, the baseline drift present in the original signal was removed using median filters of length 200 ms and 600 ms. The next step was to remove noise above 40 Hz, since the frequency band of the ECG signal extends from below 1 Hz to approximately 40 Hz, whereas the input ECG waveform contains frequencies from below 1 Hz up to 180 Hz. The signal was therefore reconstructed from its approximation and details in such a way that the noise fell outside the retained bands. Developing an algorithm for the detection of the P wave, QRS complex and T wave in an ECG is a difficult problem because of the time-varying morphology of the signal under different physiological conditions [2]. We have used the Daubechies wavelet to detect these features.
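The baseline-correction step can be sketched as below. This is only an illustration under stated assumptions: the two median filters are applied in cascade (the text gives the lengths but not how they are combined), the sampling rate defaults to 360 Hz (the MIT-BIH rate, consistent with the 180 Hz upper limit mentioned above), and the function names are hypothetical.

```python
from statistics import median

def median_filter(signal, width):
    """Sliding median with an odd window; edges use a truncated window."""
    half = width // 2
    n = len(signal)
    return [median(signal[max(0, i - half):min(n, i + half + 1)]) for i in range(n)]

def remove_baseline(signal, fs=360):
    """Estimate the baseline with 200 ms then 600 ms medians and subtract it."""
    w1 = round(0.2 * fs) | 1  # 200 ms window, forced to an odd sample count
    w2 = round(0.6 * fs) | 1  # 600 ms window, forced to an odd sample count
    baseline = median_filter(median_filter(signal, w1), w2)
    return [s - b for s, b in zip(signal, baseline)]
```

Narrow events such as the QRS complex occupy far fewer than half of the samples in either window, so the medians follow the slow drift and leave the complexes intact after subtraction.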

The algorithms presented in this section are applied in a single run over the whole digitized ECG signal, which is saved as data files. The description of the ECG feature extraction algorithm is shown in Fig. 1. First, the peak of the QRS complex, which has the dominant amplitude in the signal, is detected. Then the Q and S waves are detected. The maxima of the P and T waves are located last. The basic flow chart for the detection of features is shown in Fig. 2.

R Detection
Inspection of details D1 and D2 shows that they contain only noise, so they were left out of the reconstruction. Details D3 to D5 were kept because, in the noise-free signal, they contain the next-highest frequencies. All the remaining details were removed when locating R. This procedure removed both the low and the high frequencies, so high-amplitude transitions of the signal became more noticeable, even where R peaks were deformed. A practical lower limit was then applied to the signal to remove false noisy peaks. Because no two beats occur less than 0.25 second apart, pseudo-beats were also removed. Detection of R peaks is very important because they define the cardiac beats, and the accuracy of all subsequent detection depends on it.
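The thresholding and pseudo-beat rejection described here might look like the following sketch. It assumes the input is the signal already reconstructed from D3 to D5, that the lower limit is a fixed amplitude threshold chosen by the user, and that of two candidates closer than 0.25 s the taller one is kept; none of these details are specified in the text, and the function names are hypothetical.

```python
def detect_r_peaks(signal, fs, threshold, refractory_s=0.25):
    """Indices of local maxima >= threshold, at least refractory_s apart."""
    min_gap = int(refractory_s * fs)
    peaks = []
    for i in range(1, len(signal) - 1):
        is_local_max = signal[i - 1] < signal[i] >= signal[i + 1]
        if not is_local_max or signal[i] < threshold:
            continue  # not a peak, or below the lower limit
        if peaks and i - peaks[-1] < min_gap:
            # Pseudo-beat: two candidates inside 0.25 s, keep the taller one.
            if signal[i] > signal[peaks[-1]]:
                peaks[-1] = i
            continue
        peaks.append(i)
    return peaks

def heart_rate_bpm(peaks, fs):
    """Mean heart rate from R-R intervals; needs at least two peaks."""
    rr = [(b - a) / fs for a, b in zip(peaks, peaks[1:])]
    return 60.0 / (sum(rr) / len(rr))
```

The same R-R intervals that give the heartbeat rate are reused later as one of the feature-vector components.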

Discrete Wavelet Transform


ECG signals are non-stationary signals and can be analyzed in the time domain, in the frequency domain and in the time-frequency domain. The wavelet transform is a powerful tool for analyzing non-stationary signals. Wavelet analysis provides time and frequency information simultaneously, giving a time-frequency representation of a signal, and is a form of multi-resolution analysis. Various time-frequency features can be chosen for the ECG analysis system. The discrete wavelet transform (DWT) hierarchically decomposes a discrete-time signal into a series of successively lower-resolution approximation signals and their associated detail signals. The large number of known wavelet families and functions provides a rich space in which to search for a wavelet that represents the signal of interest efficiently.

Fig 3: Algorithm for R detection

1. INPUT ECG SIGNAL
2. BASE LINE DRIFT REMOVAL
3. CALCULATION OF DB6 APPROXIMATION AND DETAILS
4. NOISE REMOVAL (EXCLUSION OF D1 & D2)
5. LOCATING R-PEAKS USING D3-D5 DETAILS AND FINDING HEARTBEAT RATE
6. RECONSTRUCTION LOCATING Q-S POINTS ON SIGNAL CONSTRUCTED USING D5-D8
7. LOCATING P-T MAXIMA USING D5-D8 DETAILS
8. CONSTRUCTING FEATURE VECTOR USING TIME INTERVALS OF DIFFERENT LOC
9. PREPARATION OF QUERY VECTOR OF ABNORMALITY
10. COMPARISON OF QUERY VECTOR WITH TIME SERIES DATA FOR LIKELY DETECTION OF ABNORMALITY IN THE INPUT SIGNAL

Figure 2: General block diagram for the complete process

QS Detection
To make the peaks noticeable, all details of the signal up to D5 were removed, leaving the approximation signal. Around each previously detected R peak, this signal was searched for the farthest points within about 25 percent of the duration of a single heartbeat. The point on the left was taken as the Q peak and the point on the right as the S peak. The procedure of the Q and S wave detection algorithm is shown in Fig. 4.
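One possible reading of this search is sketched below. It assumes that "farthest points" means the lowest sample on each side of the R peak and that the window is 25 percent of the mean R-R distance; both are interpretations rather than details given in the text, and `detect_q_s` is a hypothetical name.

```python
def detect_q_s(signal, r_peaks, window_frac=0.25):
    """Return (q, r, s) index triples: the minima left and right of each R."""
    if len(r_peaks) < 2:
        return []  # the beat length cannot be estimated from one peak
    mean_rr = (r_peaks[-1] - r_peaks[0]) / (len(r_peaks) - 1)
    w = max(1, int(window_frac * mean_rr))
    triples = []
    for r in r_peaks:
        left = range(max(0, r - w), r)
        right = range(r + 1, min(len(signal), r + w + 1))
        if not left or not right:
            continue  # R peak sits on the edge of the record
        q = min(left, key=lambda i: signal[i])
        s = min(right, key=lambda i: signal[i])
        triples.append((q, r, s))
    return triples
```

Returning the R index alongside Q and S keeps the beats aligned even when an edge beat is skipped.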

The abnormality feature vector must be constructed so that it does not give any false negatives. Minimizing the number of false positives then determines the accuracy and sensitivity of the algorithm. In this work, Premature Atrial Contraction (PAC) and Premature Ventricular Contraction (PVC) were taken as the abnormalities, and a similarity search for probable PAC and PVC pulses was carried out.

CONCLUSION
At present, all results are based on the signal from the MLII lead alone, out of the 12-lead ECG system. An in-depth survey of abnormalities can be carried out using the data from all 12 leads of the electrocardiogram, since each lead emphasizes only specific details and suppresses the rest. This would improve the sensitivity of the system to the various features of the signal. With some changes to the procedure, the same mechanism could be used to predict the probable future behaviour of an arrhythmic heart, which may help medical staff to take timely decisions.

Fig 4: Algorithm for Q&S detection

P and T wave Detection


These waves are more noticeable when details D4 to D8 are kept, as they are the low-frequency (slowly varying) components of the signal. At these levels, the high-frequency ripples of the signal are removed. The points of maximum amplitude within a distance of 25 percent of the duration of a single pulse before the Q location and after the S location are detected as P and T, respectively. The flow of the P and T wave detection algorithm is shown in Fig. 5.
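A minimal sketch of this window search follows, assuming the input is the signal reconstructed from the low-frequency details and that the Q and S indices and the beat length in samples are already known. The function and parameter names are hypothetical.

```python
def detect_p_t(signal, qs_pairs, beat_len, window_frac=0.25):
    """Return (p, t) index pairs: the maxima before each Q and after each S."""
    w = max(1, int(window_frac * beat_len))
    pairs = []
    for q, s in qs_pairs:
        before_q = range(max(0, q - w), q)
        after_s = range(s + 1, min(len(signal), s + w + 1))
        if not before_q or not after_s:
            continue  # window falls outside the record
        p = max(before_q, key=lambda i: signal[i])
        t = max(after_s, key=lambda i: signal[i])
        pairs.append((p, t))
    return pairs
```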

Fig 5: Algorithm for P and T detection

Feature Extraction
Using the time lengths RP, RQ, RR, RS, RT and QRS of all the pulses, the mean and variance of each were calculated and used as the feature vector for the normal heartbeat. Similarly, depending on the type of abnormality and its features as reported in the literature, an approximate feature vector for the abnormality is calculated.
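The construction of the normal-beat feature vector can be illustrated as follows. The sketch assumes each beat is given as a dictionary of sample indices for P, Q, R, S and T, that RR is measured to the preceding beat, and that the population variance is meant; these choices and the names used are assumptions, not taken from the text.

```python
from statistics import mean, pvariance

INTERVALS = ("RP", "RQ", "RR", "RS", "RT", "QRS")

def beat_intervals(beats, fs):
    """Per-beat intervals in seconds; the first beat has no RR and is skipped."""
    rows = []
    for prev, cur in zip(beats, beats[1:]):
        rows.append({
            "RP": (cur["R"] - cur["P"]) / fs,
            "RQ": (cur["R"] - cur["Q"]) / fs,
            "RR": (cur["R"] - prev["R"]) / fs,
            "RS": (cur["S"] - cur["R"]) / fs,
            "RT": (cur["T"] - cur["R"]) / fs,
            "QRS": (cur["S"] - cur["Q"]) / fs,
        })
    return rows

def feature_vector(rows):
    """Mean and variance of each interval, concatenated in INTERVALS order."""
    vec = []
    for name in INTERVALS:
        values = [row[name] for row in rows]
        vec.extend([mean(values), pvariance(values)])
    return vec
```

The resulting 12-element vector can be compared with a query vector for an abnormality using the same Euclidean distance as for the raw sequences.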

MATLAB SIMULATION AND RESULTS

[Figure: D1-D8 details, time domain]

[Figure: D1-D8 details (D1-D2 show the presence of noise above 45 Hz)]
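The 45 Hz boundary follows from the dyadic band splitting of the DWT. As a rough illustration, assuming ideal half-band filters and a sampling rate of 360 Hz (the MIT-BIH rate; real db6 filters overlap somewhat at the band edges), the nominal band of each detail level can be computed as below. The function name is hypothetical.

```python
def detail_bands(fs, levels):
    """Nominal (low, high) band in Hz of each detail level D1..D<levels>."""
    return {f"D{j}": (fs / 2 ** (j + 1), fs / 2 ** j) for j in range(1, levels + 1)}

bands = detail_bands(360, 8)
for name, (low, high) in bands.items():
    print(f"{name}: {low:g} to {high:g} Hz")
```

Under these assumptions D1 covers 90 to 180 Hz and D2 covers 45 to 90 Hz, so excluding both removes the content above 45 Hz, while D3 to D5 span roughly 5.6 to 45 Hz, where most of the QRS energy lies.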

References:

[1] Ivan Popivanov, Renee J. Miller, Similarity search over time series data using wavelets, Proceedings of the 18th International Conference on Data Engineering (ICDE'02), 2002, IEEE.
[2] S. Z. Mahmoodabadi, A. Ahmadian, M. D. Abolhasani, ECG feature extraction using Daubechies wavelets.
[3] Mohamed O. Ahmed Omar, Nahed H. Solouma, Yasser M. Kadah, Morphological characterization of ECG signal abnormalities: a new approach, Proc. Cairo International Biomedical Engineering Conference, 2006.
[4] www.physionet.org
