Available online at www.sciencedirect.com
ScienceDirect
www.elsevier.com/locate/procedia
1. Introduction

Achieving sustainability in manufacturing is of ever-increasing importance to meet the growing demands of strict environmental and occupational safety regulations and diminishing natural resources [1]. As gearboxes are a critical component of power transmission systems in most modern manufacturing systems, consideration of the life cycle of gearboxes is expected to be a key strategy for enhancing the sustainability of manufacturing. Specifically, accurate and timely diagnosis of gear faults is not only essential for avoiding unplanned maintenance or shutdowns, but also provides opportunities for reuse or remanufacturing of faulted gears, as well as obtaining useful information on fault occurrence in the current gearbox design to be used for redesign towards a longer life cycle, as shown in Fig. 1.

Fault diagnosis of gearboxes has traditionally been carried out by analyzing sensor measurements (e.g. vibration) in the time-frequency domain, where fault characteristic features are extracted to reveal the gearbox health condition, abnormalities, and fault types [2]. However, the performance of the time-frequency analysis heavily relies on the accuracy and comprehensiveness of physical knowledge (e.g. frequency responses of certain fault types). In addition, time-frequency analysis cannot reliably distinguish fault types and severity levels if the faults share the same characteristic frequency. Recently, artificial intelligence algorithms, especially Deep Neural Networks (DNNs), have been
2212-8271 © 2019 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/3.0/)
Peer-review under responsibility of the scientific committee of the 26th CIRP Life Cycle Engineering (LCE) Conference.
10.1016/j.procir.2018.12.008
John Grezmak et al. / Procedia CIRP 80 (2019) 476–481 477
and bias $b_j^l$ in the $l$th layer, $M^{l-1}$ denotes all feature maps in the $(l-1)$th layer, and $*$ is the convolution operator. Along the network propagation, the lower-level convolutional layers extract lower-level features (e.g. edges and curves), and higher-level layers extract higher-level features (i.e. combinations of the lower-level features, motifs).

Each convolution layer is typically followed by a pooling layer, which down-samples the outputs of the preceding convolution layer. To perform down-sampling, each feature map is subject to region-wise pooling operations that act on $N^l \times N^l$ sized non-overlapping regions of the feature map. Typical pooling operations include calculating the mean or maximum value in the $N^l \times N^l$ regions.

After the final convolutional or pooling layer, the resulting feature maps are concatenated into a feature vector, which is fully connected via weights to the classification layer. The values at the classification layer resulting from the weighted multiplications of the fully-connected layer values are passed through a final non-linear activation function, such as the sigm() function, to obtain a value in the range of 0 to 1 specifying the likelihood that the input corresponds to the class associated with that value. The weights and biases in the network are learned during a training process such that a satisfactory percentage of inputs are correctly classified on a set of testing data. Learning is achieved through use of the stochastic gradient descent method for optimizing an objective function related to the error between the actual output of the network and the desired output. The gradients required for this method can be computed by the backpropagation method [21].

2.2. LRP-based pixel-wise explanation

Once a DCNN has been trained to classify the images with satisfactory performance, LRP is applied to interpret the classification decisions made by the DCNN. LRP decomposes an output of a trained DCNN into a set of relevance scores for pixels in the associated input image, which quantify the contribution of each pixel to the output values. Such relevance scores are leveraged in this paper to interpret how pixels in the input image affect the classifier's decision on fault type and severity for that input.

LRP is performed in a top-down manner with respect to the classifier structure, starting with a relevance score at the output layer, which can be taken as a real-valued prediction output of the classifier corresponding to a specific class, and propagating relevance scores to the input pixels. The relevance is propagated such that the sum of relevance scores in each layer is constant:

$$f(x) = \cdots = \sum_{d \in l+1} R_d^{(l+1)} = \sum_{d \in l} R_d^{(l)} = \cdots = \sum_{d} R_d^{(1)} \tag{2}$$

where $f(x)$ is the real-valued prediction output, and $R_d^{(l)}$ is the relevance score for neuron $d$ (i.e. a pixel in a feature map in hidden layers) in layer $l$. The relevance in a given layer is distributed to the neurons in the preceding layer depending on the layer type. For fully-connected and convolutional layers, the relevance scores in the preceding layer are computed as

$$R_i^{(l)} = \sum_{j} \frac{z_{ij}}{\sum_{i'} z_{i'j} + \varepsilon \, \mathrm{sign}\!\left(\sum_{i'} z_{i'j}\right)} \, R_j^{(l+1)} \tag{3}$$

where sign() is equal to the sign of the argument of this function and $z_{ij}$ is the contribution of the activation at neuron $i$ towards the total pre-activation of neuron $j$, $z_j$, which is defined as $z_{ij} = a_i^{(l)} w_{ij}^{(l,l+1)}$, where $a_i^{(l)}$ is the activation of neuron $i$ in layer $l$ and $w_{ij}^{(l,l+1)}$ is the weight connecting neurons $i$ and $j$, and $\varepsilon$ is a numerical stabilizer. This propagation rule implies that, for fully-connected and convolutional layers, the neurons in the preceding layers receive a share of the relevance scores based on their relative contribution to the next layer values in the forward pass. For pooling layers, the relevance scores are up-sampled to match the output dimensions of the previous layer and scaled by the scaling factor between the layers.

Because the pre-activation contribution of a single neuron can have a different sign than the total pre-activation due to all neurons in the same layer, the propagated relevance scores according to Eq. (3) can take on positive or negative values. Neurons with relevance scores that take on positive values are interpreted as having an overall net positive contribution to the classifier output. Thus, input pixels with positive relevance can be interpreted as evidence in the input image for a particular classification. In the same manner, neurons with negative relevance scores can be interpreted as evidence against a particular classification. A higher relevance score denotes stronger evidence for or against a particular classification depending on the sign of the relevance score.

Fig. 2. Framework for LRP-based explainable DCNN for gearbox fault diagnosis

crafted fault types: no fault (normal), slight crack, large crack, or missing tooth. The shaft connected to the 55-tooth driven gear is connected to a brake. Vibration sensors are mounted on the gearbox housing near the bearings in four locations. In this study, vibration signals from a sensor mounted near the driving gear are used for testing and training the DCNN.
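The propagation rule of Eq. (3) in Section 2.2 can be illustrated for a single fully-connected layer with the minimal sketch below. It is not the authors' implementation. The function names, the toy activations and weights, and the convention that sign(0) is treated as +1 (so the denominator never vanishes) are all assumptions made for illustration. Only the predicted class carries relevance at the output, as described in Section 4.4.

```python
def sign(value):
    # Assumption: sign(0) is treated as +1 so the denominator never vanishes.
    return 1.0 if value >= 0 else -1.0


def lrp_epsilon_dense(activations, weights, relevance_out, eps=0.1):
    """Redistribute relevance from layer l+1 to layer l following Eq. (3).

    activations   : a_i for the lower layer (list of length I)
    weights       : weights[i][j] = w_ij connecting neuron i to neuron j
    relevance_out : R_j for the upper layer (list of length J)
    eps           : numerical stabilizer (0.1 is the value quoted in Section 4.4)
    """
    n_in = len(activations)
    relevance_in = [0.0] * n_in
    for j, r_j in enumerate(relevance_out):
        # z_ij = a_i * w_ij: contribution of neuron i to the pre-activation of j
        z = [activations[i] * weights[i][j] for i in range(n_in)]
        total = sum(z)
        denom = total + eps * sign(total)
        for i in range(n_in):
            relevance_in[i] += z[i] / denom * r_j
    return relevance_in


# Hypothetical toy layer: 3 input neurons, 2 output neurons.
activations = [1.0, 0.0, 2.0]
weights = [[0.5, -1.0], [0.3, 0.8], [-0.1, 1.0]]
relevance_out = [1.0, 0.0]  # only the predicted class keeps its score

relevance_in = lrp_epsilon_dense(activations, weights, relevance_out, eps=0.1)
print(relevance_in)  # approximately [1.25, 0.0, -0.5]
```

In this toy case the first neuron receives positive relevance and the third negative relevance, because its contribution has the opposite sign to the total pre-activation. The neuron with zero activation receives zero relevance. The relevance sums to 0.75 rather than 1.0: with a non-zero stabilizer the conservation in Eq. (2) holds only approximately, and it is recovered as the stabilizer tends to zero.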
4.4. LRP-based explanation of DCNN's classifications

The LRP method is used to obtain the relevant features in the time-frequency spectra of the testing data for the DCNN's classifications. The final layer relevance score for each sample is set as the value corresponding to the predicted class (i.e. the values corresponding to all other classes are set to 0), and relevance is propagated to the input pixels using the propagation rules described in Section 2.2, with a numerical stabilizer value of 0.1. To visualize the relevant features in the inputs, the relevance scores for each sample are normalized and plotted as a heatmap such that positive and negative relevance is shown in hot and cold hues, respectively, with the colormap centered at relevance scores of zero, which indicate no significant positive or negative contribution to a classification.

5. Results and Discussions

Heatmaps of relevance scores for 4 time-frequency spectra images for each fault type are shown in Fig. 4. Each image represents a 0.5 s time-frequency spectrum (roughly 15 gear revolutions per image) of the vibration data corresponding to a given fault type. The vertical axis shows the frequency values from the time-frequency spectra that the pixels are associated with. The colormaps have been chosen such that red colors correspond to positive relevance, blue colors correspond to negative relevance, and green corresponds to neutral (approximately zero) relevance.

Most of the positive and negative relevance falls within the frequency range of just below 500 Hz to 1500 Hz, while neutral relevance is dominant outside of this range. This is expected because the vibration components outside of this range are very small in comparison to those within, which are more closely related to the first harmonic of the meshing frequency and its sidebands, and vibration components with very small wavelet coefficients from the CWT will be assigned a value of 0 when the spectrum is converted to a gray-scale image. According to the propagation rule of Eq. (3), any inputs of 0 will be assigned a relevance score of 0, since they do not contribute to the output layer values. Additionally, pixels near the image edges tend to have neutral relevance scores, since they have sparser connections to the subsequent layer neurons than pixels nearer to the center, and thus receive a smaller amount of relevance.

In the 500 Hz to 1500 Hz range, the relevance scores appear to follow patterns specific to the associated fault type. Specifically, bands of positive and negative relevance with nearly constant widths along the frequency axis are shown to span across the time domain of the images, with each fault type displaying unique band locations. This suggests that the DCNN classifier has learned to differentiate between the underlying fault types from the time-frequency images by detecting differences in the relative shapes of the frequency distributions as represented in the time-frequency spectra, which have a near cyclic pattern in time with the gear revolution. This is intuitively plausible, since changes in the frequency distribution are expected to occur in faulty gears as a result of sideband amplitude changes, which are a function of fault type and severity [23].

The fast Fourier transform of a vibration signal for each fault type is shown in Fig. 5. The differences between these frequency spectra can be easily seen by visual inspection. For example, the amplitude decrease in the frequency range of 1100 to 1200 Hz for the slight crack distinguishes this fault type from the normal gear, and the amplitude decrease in the frequency range of 1200 to 1300 Hz for the missing tooth distinguishes this fault type from the large crack. In Fig. 4, bands of positive or negative relevance can be seen at these frequency locations with opposite signs between the fault classes (e.g., a band of positive relevance is seen at 1100 to 1200 Hz for the slight crack, while a band of negative relevance is seen at the same frequency range for the normal gear). This suggests that the DCNN has been trained to distinguish between the fault classes based on these changes in the frequency spectra, as represented in the time-frequency input images.

6. Conclusions

This paper presents an explainable DCNN, developed on the basis of layer-wise relevance propagation, for gearbox
[Figure 4: relevance heatmaps in four panels (Normal, Slight Crack, Large Crack, Missing Tooth). Vertical axis: Frequency (Hz), ticks at 500, 1000, 1500. Horizontal axis: Time (s), 0 to 2. Annotations: negative band (1100-1200 Hz) on Normal, positive band (1100-1200 Hz) on Slight Crack, and negative and positive bands (1200-1300 Hz) on the remaining two panels.]
Fig. 4. LRP results for 4 sample inputs from each fault type
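The heatmaps in Fig. 4 depend on normalizing the relevance scores so that zero relevance stays at the center of the colormap, and the discussion reads off the sign of relevance within a frequency band. The sketch below shows one way this could be done. The paper does not state which normalization it uses, so scaling by the largest magnitude is an assumption, as are the function names. The relevance values and frequencies are made-up toy numbers, not data from the study.

```python
def normalize_relevance(relevance):
    """Scale a 2-D relevance map to [-1, 1] by its largest magnitude.

    Dividing by the peak magnitude (an assumed normalization) keeps zero
    relevance at exactly zero, so a diverging colormap stays centered there.
    """
    peak = max(abs(v) for row in relevance for v in row)
    if peak == 0:
        return [[0.0 for _ in row] for row in relevance]
    return [[v / peak for v in row] for row in relevance]


def band_mean(relevance, freqs, f_lo, f_hi):
    """Mean relevance over all pixels whose row frequency lies in [f_lo, f_hi]."""
    vals = [v for row, f in zip(relevance, freqs) if f_lo <= f <= f_hi for v in row]
    return sum(vals) / len(vals) if vals else 0.0


# Toy relevance map: one row per frequency bin, one column per time step.
freqs = [500, 1150, 1250, 1500]  # Hz, hypothetical bin centers
relevance = [
    [0.0, 0.0],
    [0.2, 0.4],
    [-0.1, -0.3],
    [0.0, 0.0],
]

scaled = normalize_relevance(relevance)
print(band_mean(scaled, freqs, 1100, 1200))  # positive: evidence for the class
print(band_mean(scaled, freqs, 1200, 1300))  # negative: evidence against it
```

A band mean like this is only a convenience for summarizing what the heatmap shows visually. The sign of the band, not its exact magnitude, is what the discussion of Fig. 4 relies on.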