ABSTRACT: Acoustic Emission (AE) has seen increased popularity in machine condition monitoring applications. AE applications usually involve higher sampling rates than vibration signals, not rarely reaching 2 MHz. One of the main challenges in AE-based fault diagnosis is the need to preprocess the massive amounts of data generated by this technique, including the engineering of appropriate features and dimensionality reduction so that such massive datasets can be handled. In this paper, we propose a novel method based on Deep Convolutional Neural Networks (CNNs) to handle raw AE signals for the diagnosis of a system’s health states. This method is flexible enough not only to handle the massive amount of AE data, but also to provide the means for automatic feature extraction by applying various filters to the raw AE signals, thus identifying relevant frequencies related to different faults. The proposed CNN method is applied to fatigue crack detection on the blades of an experimental rotor.
1 INTRODUCTION
Unscheduled maintenance of mechanical systems leads to loss of production and may also affect safety. There are many ways to minimize this effect, among them increased redundancy, programmed maintenance to identify problems that could lead to extended downtimes, and condition monitoring (Rabiei, Droguett and Modarres, 2016).
A popular approach to condition monitoring is vibration analysis. However, vibration monitoring usually detects damage only once it has already developed, which poses a significant limitation in sensitive systems. On the other hand, Acoustic Emission techniques are gaining ground because they can identify damage at early stages, with the tradeoff of introducing higher sampling rates that result in massive data of higher dimensionality.
Moreover, both AE and vibration monitoring require signal preprocessing and interpretation such as wavelets, fast Fourier transform and band filtering, among others (Riaz et al., 2017), a labor-intensive and expensive endeavor requiring specialized engineering expertise.
Machine learning techniques have become a popular choice for fault diagnosis and prognosis. Most of these shallow models rely heavily on manual feature identification and extraction (Ruiz-Gonzalez et al., 2014; Kane and Andhare, 2016; Li et al., 2016). As discussed in (Verstraete et al., 2017), the performance of these methods depends on the quality of the hand-engineered features, which requires significant understanding of the system’s degradation processes.
To tackle these challenges, we propose a deep CNN-based method for fault diagnosis that operates on massive raw acoustic signals and allows for automatic hierarchical “layer to layer” feature extraction to learn complex representations of the data.
The remainder of the paper is structured as follows. Section 2 introduces deep learning and CNNs and their architectural building blocks. Then, Section 3 discusses the proposed method and its application and validation for fault diagnosis of an experimental rotor, and compares its performance to a fully optimized shallow neural network. Section 4 presents some concluding remarks.
2 CONVOLUTIONAL NEURAL NETWORKS
2.1 Artificial Neural Networks
Artificial Neural Networks (ANNs) are comprised of simple units called neurons. Each of these neurons creates a linear combination of an input (x) with a weight (w) and a bias (b), parameters that are learned by the ANN.
In addition, an activation function (f) adds the nonlinear behavior that allows the network to compute nontrivial responses. The output (O) of a neuron is then computed by Equation (1):
$$O(x) = f(wx + b) \tag{1}$$
2.2 Deep Learning
Simply put, deep learning is a branch of machine learning that uses many hidden layers to perform the learning. Deep learning based networks learn multiple features on top of the features learned by previous layers, integrating the concept of a hierarchy between features that implies different levels of abstraction (Deng and Yu, 2014). This is important for achieving high accuracy in tasks that involve complex relationships among data, such as image recognition and signal processing.
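Equation (1) and the layer stacking described in Section 2.2 can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the vector form of the weighted sum, the ReLU choice for f, the helper names `neuron` and `dense_layer`, and all numeric values are assumptions made for illustration.

```python
import numpy as np

def relu(z):
    # Example activation f; any nonlinear function could be used here.
    return np.maximum(0.0, z)

def neuron(x, w, b, f=relu):
    # Equation (1), written for a vector input: O(x) = f(w . x + b).
    return f(np.dot(w, x) + b)

def dense_layer(x, W, b, f=relu):
    # A layer is a set of neurons sharing the same input; row i of W holds
    # the weights of neuron i.
    return f(W @ x + b)

# Hypothetical input and parameters, chosen only to make the arithmetic easy.
x = np.array([1.0, -2.0, 0.5])
w = np.array([0.5, 0.25, 2.0])
out = neuron(x, w, b=0.1)  # 0.5 - 0.5 + 1.0 + 0.1 = 1.1

# Two stacked layers: the second layer operates on features produced by the
# first one, which is the hierarchy that deep learning exploits.
W1 = np.array([[1.0, 0.0, 0.0],
               [0.0, 1.0, 0.0]])
b1 = np.zeros(2)
W2 = np.array([[2.0, 3.0]])
b2 = np.array([-1.0])
hidden = dense_layer(x, W1, b1)         # relu([1, -2]) = [1, 0]
deep_out = dense_layer(hidden, W2, b2)  # relu(2*1 + 3*0 - 1) = [1]
print(out, deep_out)
```

In a trained network the values of `w`, `b`, `W1` and `W2` would be learned from data rather than fixed by hand.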
2.3 Convolutional Neural Network Overview
Convolutional Neural Networks (CNNs) are a type of neural network specialized for processing grid-topology data (Goodfellow, Bengio and Courville, 2017). CNNs have been shown to outperform shallow architectures in many image recognition tasks and have been applied to vibration-based fault diagnosis (Verstraete et al., 2017).
The main characteristics of CNNs are that the layers have sparse connectivity and parameter sharing. The first characteristic means that CNNs use filters that are considerably smaller than the input, implying that the filters store fewer parameters than a shallow neural network and detect important features of the input. The second one implies that the filter weights in a convolutional layer are used multiple times across the input, resulting in computationally efficient matrix multiplication.
2.3.1 Convolutional Layer
As we are dealing with raw AE data, which are one-dimensional, the convolution operation in the proposed model is also 1D. This means that the filters (w(t)) learned by the network are in the time domain and generate a filtered signal of the input (x(t)) highlighting features that represent the system’s health state. The output signal of a 1D convolution, s(t), is computed as shown in Equation (2):
$$s(t) = (x * w)(t) = \sum_{a=-\infty}^{\infty} x(a)\, w(t - a) \tag{2}$$
$$\hat{x}^{(k)} = \frac{x^{(k)} - E[x^{(k)}]}{\sqrt{Var[x^{(k)}]}} \tag{3}$$
2.3.3 Pooling Layer
Pooling layers are used to reduce the dimension of the input and achieve spatial invariance. This is usually accomplished by replacing all values in a pooling window with the maximum value of that window. This lowers the resolution while retaining only the most important feature.
2.3.4 Activation Function
As discussed before, activation functions add nonlinear behavior to the network. In the proposed CNN architecture, we employ Rectified Linear Units (ReLUs) as the activation function for the convolutional layers, as they provide increased sparsity compared with Tanh or Sigmoid activation functions, thus decreasing computation time (Maas, Hannun and Ng, 2013). The commonly used ReLU is shown in Equation (5):
$$g(x) = \max(0, x) \tag{5}$$
For the fully connected layers (see Section 2.3.5), the softmax activation function is used to quantify the probability that a sample corresponds to a given system health state. This function is displayed in Equation (6):
$$f_j(\mathbf{z}) = \frac{e^{z_j}}{\sum_{k=1}^{K} e^{z_k}} \tag{6}$$
$$Loss = -\sum_{x} p(x) \log(q(x)) \tag{7}$$
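The building blocks in Equations (2), (3), (5), (6) and (7) can be sketched in a few lines of NumPy. This is an illustrative sketch, not the authors' code. Equation (3) is read here as a batch normalization transform and Equation (7) as a cross-entropy loss; both readings are assumptions based on the form of the equations. The small `eps` term and the max-shift inside the softmax are numerical-stability additions that do not appear in the equations, and all input values are hypothetical.

```python
import numpy as np

def conv1d(x, w):
    # Equation (2) for finite signals: s(t) = sum_a x(a) * w(t - a).
    return np.convolve(x, w, mode="full")

def batch_norm(x, eps=1e-8):
    # Equation (3), applied per feature (column) over a batch (rows).
    # eps is an assumed stabilizer, not part of Equation (3).
    return (x - x.mean(axis=0)) / np.sqrt(x.var(axis=0) + eps)

def relu(x):
    # Equation (5): g(x) = max(0, x).
    return np.maximum(0.0, x)

def softmax(z):
    # Equation (6); subtracting max(z) leaves the result unchanged
    # mathematically but avoids overflow in exp.
    e = np.exp(z - np.max(z))
    return e / e.sum()

def cross_entropy(p, q):
    # Equation (7): p is the target distribution, q the predicted one.
    return -np.sum(p * np.log(q))

# Hypothetical signal and a two-tap difference filter.
signal = np.array([1.0, 2.0, 3.0])
kernel = np.array([1.0, -1.0])
filtered = conv1d(signal, kernel)          # [1, 1, 1, -3]
activated = relu(filtered)                 # [1, 1, 1, 0]

normalized = batch_norm(np.array([[1.0], [3.0]]))  # approximately [[-1], [1]]

probs = softmax(np.array([0.0, 0.0, 0.0]))         # uniform over three classes
loss = cross_entropy(np.array([1.0, 0.0, 0.0]), probs)  # log(3)
print(filtered, activated, probs, loss)
```

The difference filter is only an example of a time-domain filter; in the CNN, the filter coefficients are learned during training.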
However, the implementation of de-noising methods incurs a loss of information and adds pre-processing time, which we want to avoid and instead handle with the proposed architecture.
Moreover, based on the raw signals and the amplitude spectra shown in Figure 3, the health conditions are remarkably similar, which, coupled with the signal noise levels, makes this dataset a significantly challenging diagnosis task.
3.2 Proposed Deep CNN Architecture
The proposed CNN architecture, processing batches of 256 samples, consists of five convolutional layers as follows (see Figure 4): the first convolutional layer has 32 oversized filters of 128x1 designed to tackle background noise in the raw acoustic emission signal; this is followed by four convolutional layers with 32, 32, 64 and 128 filters of size 3x1, respectively, which are designed to automatically and hierarchically extract features from the AE data. The last convolutional layer’s output is reshaped before being fed to the last two fully connected layers, each with 1024 neurons, which are responsible for processing the features obtained from the convolutional layers to perform fault diagnosis.
The proposed CNN method is trained for 15000 epochs, where one epoch is a full pass over all training samples. The CNN is regularized via dropout for the fully connected layers with a keep probability of 50% and L2 regularization (see Section 2.3.7), as well as early stopping by saving the best epoch in terms of accuracy and generalization capability (train loss remaining low as test loss decreases).
Also of note is that the proposed CNN architecture does not have pooling layers. There are two underlying reasons: firstly, the proposed CNN method is not required to achieve spatial invariance as it deals with raw acoustic emission signals; secondly, the CNN improved only marginally (in terms of accuracy and generalization) with the reduction in size of the feature maps resulting from the pooling layers or, conversely, its performance deteriorated due to the
Figure 4 Architecture of the CNN
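To give a sense of the model size implied by the layer description in Section 3.2, the following sketch counts learnable parameters layer by layer. The layer sizes (32 filters of 128x1, then 32, 32, 64 and 128 filters of 3x1, then two fully connected layers of 1024 neurons) come from the text. Everything else is an assumption: a single-channel input, stride 1 with zero padding that preserves the signal length, one bias per filter or neuron, a hypothetical input length of 1024 points, and omission of the final classification layer.

```python
# (filter length, number of filters) for the five convolutional layers
CONV_LAYERS = [(128, 32), (3, 32), (3, 32), (3, 64), (3, 128)]
# Neurons in the two fully connected layers
FC_UNITS = [1024, 1024]

def conv1d_params(kernel_size, in_channels, n_filters):
    # Each filter has kernel_size * in_channels weights plus one bias.
    return kernel_size * in_channels * n_filters + n_filters

def dense_params(n_inputs, n_units):
    # Each neuron has n_inputs weights plus one bias.
    return n_inputs * n_units + n_units

def count_parameters(signal_length, in_channels=1):
    counts = []
    channels = in_channels
    for kernel_size, n_filters in CONV_LAYERS:
        counts.append(conv1d_params(kernel_size, channels, n_filters))
        channels = n_filters
    # No pooling and length-preserving padding (assumed): the flattened
    # feature vector has signal_length * channels entries.
    n_inputs = signal_length * channels
    for n_units in FC_UNITS:
        counts.append(dense_params(n_inputs, n_units))
        n_inputs = n_units
    return counts

# signal_length=1024 is a hypothetical value, not taken from the paper.
counts = count_parameters(signal_length=1024)
names = ["conv1", "conv2", "conv3", "conv4", "conv5", "fc1", "fc2"]
for name, n in zip(names, counts):
    print(f"{name}: {n:,}")
print(f"total: {sum(counts):,}")
```

Under these assumptions the first fully connected layer dominates the count, because without pooling the flattened feature vector grows with the signal length.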
The proposed CNN method is compared with a shallow ANN that has the same two fully connected layers, but lacks the convolutional layers. This allows us to assess the impact on fault diagnosis performance of the convolutions as a signal filtering and de-noising tool, as well as the quality and robustness of the extracted features. This ANN is fully optimized with the ADAM adaptive gradient-based optimization algorithm and regularized via dropout (with 50% keep probability), weight regularization for both hidden layers and early stopping.
Table 2 shows the overall test fault diagnosis accuracy. The proposed CNN method significantly outperforms the shallow ANN in terms of accuracy and generalization capacity, with the ANN barely learning from the complex AE dataset.
Moreover, the unnormalized and normalized confusion matrices are shown in Table 4 and Figure 5, respectively.

Table 4 Confusion Matrix for the proposed CNN method.
              5 [mm]    20 [mm]    Undamaged
5 [mm]          363        35           6
20 [mm]          28       382           0
Undamaged         8         0         402

Figure 5 Normalized confusion matrix for the proposed CNN method.
Based on these results, the proposed CNN method outperforms the shallow ANN for the rotor’s fault diagnosis based on acoustic emission monitoring. This is corroborated by Figure 6 a) and c), which show that the CNN presents a monotonically decreasing testing loss that leads to improvement in the fault diagnosis accuracy. It should be noted, however, that the ratio of accuracy improvement to training time for the CNN is very low for the last epochs, even though the network still learns. This could be driven by a very low effective learning rate for these epochs, as ADAM adapts its step sizes during training.
In contrast, as shown in Figure 6 b) and d), the ANN barely learns from the raw AE data, which could be attributed to the complexity of the data as well as the meaningless features that the ANN extracts by treating the signals as independent points, a problem that seems to be compensated by the convolutional filters in the CNN.
However, the superior performance achieved by the proposed CNN method comes at a much higher computational cost: the significant number of learnable parameters and hyperparameters leads to extended training times.
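The counts in Table 4 can be turned into an overall accuracy and a normalized confusion matrix with a short NumPy sketch. The counts are copied from Table 4. Treating rows as the true class and columns as the predicted class is an assumption, since the table does not state its orientation; the overall accuracy does not depend on it, but the row normalization does.

```python
import numpy as np

labels = ["5 mm", "20 mm", "Undamaged"]
# Counts from Table 4 (rows assumed to be true classes).
cm = np.array([[363,  35,   6],
               [ 28, 382,   0],
               [  8,   0, 402]])

# Overall accuracy: correctly classified samples (diagonal) over all samples.
accuracy = np.trace(cm) / cm.sum()

# Row-normalized matrix: each row sums to 1, so the diagonal holds the
# fraction of each class that was classified correctly.
row_normalized = cm / cm.sum(axis=1, keepdims=True)
per_class = np.diag(row_normalized)

print(f"accuracy: {accuracy:.4f}")
for label, value in zip(labels, per_class):
    print(f"{label}: {value:.4f}")
```

With these counts, 1147 of the 1224 samples lie on the diagonal, and the only confusions involving the undamaged state are with the 5 mm crack.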
Figure 6 a) Accuracy behavior of the CNN, b) Accuracy behavior of the ANN, c) Loss behavior of CNN and d) Loss behavior of the ANN
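The remark that ADAM adapts the step size during training can be made concrete with a sketch of a single ADAM update following Kingma and Ba (2014). The default values of `lr`, `beta1`, `beta2` and `eps` below are the ones suggested in that reference, not necessarily the ones used for the results in Figure 6, and the function name `adam_step` is hypothetical.

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    # Exponential moving averages of the gradient and the squared gradient.
    m = beta1 * m + (1.0 - beta1) * grad
    v = beta2 * v + (1.0 - beta2) * grad ** 2
    # Bias correction for the zero initialization of m and v (t starts at 1).
    m_hat = m / (1.0 - beta1 ** t)
    v_hat = v / (1.0 - beta2 ** t)
    # The step is scaled per parameter by the ratio of the two estimates.
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

theta0 = np.array([1.0])
zeros = np.zeros(1)

# First update with two very different gradient magnitudes: in both cases
# the parameter moves by approximately lr, because the gradient is
# normalized by the square root of its second-moment estimate.
small, _, _ = adam_step(theta0, np.array([2.0]), zeros, zeros, t=1)
large, _, _ = adam_step(theta0, np.array([200.0]), zeros, zeros, t=1)
print(theta0 - small, theta0 - large)
```

Because the update depends on the ratio of the moment estimates rather than on the raw gradient, the effective step can become very small late in training when successive gradients largely cancel in the first-moment average.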
4 CONCLUSIONS
This paper has introduced a new deep CNN-based method for fault diagnosis using raw acoustic emission signals. The application of this method to an experimental rotor has shown that the proposed method delivers satisfactory performance metrics for health state diagnosis. The CNN method was also compared to a fully optimized ANN, with the former significantly outperforming the shallow method.
These solid results in fault diagnosis are mainly due to the CNN’s ability to automatically extract features from, and efficiently handle, the noisy acoustic emission signals. This also brings major advantages to the development of automated monitoring and fault diagnosis tools, such as the possibility of bypassing human intervention in the labor-intensive feature engineering process and reducing the need for preprocessing and de-noising of acoustic emission signals. Based on these preliminary results, the proposed CNN method is a promising tool for fault diagnosis.
ACKNOWLEDGMENTS
The authors acknowledge the partial financial support of the Chilean National Fund for Scientific and Technological Development (Fondecyt) under Grant No. 1160494.
REFERENCES
Deng, L. and Yu, D. (2014) ‘Deep Learning: Methods and Applications’, Foundations and Trends® in Signal Processing, 7(3–4), pp. 197–387. doi: 10.1561/2000000039.
Glorot, X. and Bengio, Y. (2010) ‘Understanding the difficulty of training deep feedforward neural networks’, 9, pp. 249–256.
Goodfellow, I., Bengio, Y. and Courville, A. (2017) Deep Learning. doi: 10.1007/s00287-016-1013-2.
Ioffe, S. and Szegedy, C. (2015) ‘Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift’. doi: 10.1007/s13398-014-0173-7.2.
Kane, P. and Andhare, A. (2016) ‘Application of psychoacoustics for gear fault diagnosis using artificial neural network’, Journal of Low Frequency Noise, Vibration and Active Control, 35(3), pp. 207–220. doi: 10.1177/0263092316660915.
Kingma, D. P. and Ba, J. (2014) ‘Adam: A Method for Stochastic Optimization’, pp. 1–15. doi: http://doi.acm.org.ezproxy.lib.ucf.edu/10.1145/1830483.1830503.
Li, C. et al. (2016) ‘Fault diagnosis for rotating machinery using vibration measurement deep statistical feature learning’, Sensors (Switzerland), 16(6). doi: 10.3390/s16060895.
Maas, A. L., Hannun, A. Y. and Ng, A. Y. (2013) ‘Rectifier Nonlinearities Improve Neural Network Acoustic Models’, Proceedings of the 30th International Conference on Machine Learning, 28, p. 6. Available at: https://web.stanford.edu/~awni/papers/relu_hybrid_icml2013_final.pdf.
Peng, H. et al. (2015) ‘A Comparative Study on Regularization Strategies for Embedding-based Neural Networks’, (1). Available at: http://arxiv.org/abs/1508.03721.
Rabiei, E., Droguett, E. L. and Modarres, M. (2016) ‘A prognostics approach based on the evolution of damage precursors using dynamic Bayesian networks’, Advances in Mechanical Engineering, 8(9), p. 168781401666674. doi: 10.1177/1687814016666747.
Riaz, S. et al. (2017) ‘Vibration Feature Extraction and Analysis for Fault Diagnosis of Rotating Machinery-A Literature Survey’, Asia Pacific Journal of Multidisciplinary Research, 5(51), pp. 103–110. Available at: www.apjmr.com.
Ruiz-Gonzalez, R. et al. (2014) ‘An SVM-Based classifier for estimating the state of various rotating components in Agro-Industrial machinery with a vibration signal acquired from a single point on the machine chassis’, Sensors (Switzerland), 14(11), pp. 20713–20735. doi: 10.3390/s141120713.
Srivastava, N. et al. (2014) ‘Dropout: A Simple Way to Prevent Neural Networks from Overfitting’, Journal of Machine Learning Research, 15, pp. 1929–1958. doi: 10.1214/12-AOS1000.
Verstraete, D. et al. (2017) ‘Deep Learning Enabled Fault Diagnosis Using Time-Frequency Image Analysis of Rolling Element Bearings’, 2017, pp. 1–29.