Robert Stengel
Robotics and Intelligent Systems,
MAE 345, Princeton University, 2013

Learning Objectives
- Left pseudoinverse
- Steepest descent
- Back-propagation
- Exact algebraic fit

Copyright 2013 by Robert Stengel. All rights reserved. For educational use only.
http://www.princeton.edu/~stengel/MAE345.html
Applications of Computational Neural Networks
- Classification of data sets
- Nonlinear function approximation
- Efficient data storage and retrieval
- System identification
- Nonlinear and adaptive control systems
Neurons
- Biological cells with significant electrochemical activity
- ~10-100 billion neurons in the brain
- Inputs from thousands of other neurons
- Output is scalar, but may have thousands of branches
Neural Action Potential
[Figures: action potential, neglecting the absolute refractory period; in the limit, neglecting the absolute refractory period]
Multipolar Neuron

Mathematical Model of Neuron Components
- Synapse effects represented by weights (gains or multipliers), e.g., w11, w12, w13, w21, w22, w23
- Neuron firing frequency is modeled by linear gain or nonlinear element

r = w^T u + b
u = s(r)
Layout of a Neural Network

Input-Output Characteristics of a Neural Network Layer
- Single layer
- Number of inputs = n; dim(u) = (n x 1)
- Number of nodes = m; dim(r) = dim(b) = dim(s) = (m x 1)

r = Wu + b
u = s(r)

W = [w_1^T; w_2^T; ...; w_m^T]    (m x n)
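The single-layer relation above (r = Wu + b, then the elementwise node function u = s(r)) can be sketched in a few lines. This is an illustrative NumPy sketch, not code from the lecture; the weight and bias values are arbitrary, and tanh stands in for a generic node function.

```python
import numpy as np

def layer(u, W, b, s=np.tanh):
    """One network layer: r = W u + b, then elementwise node function u = s(r)."""
    r = W @ u + b
    return s(r)

# n = 3 inputs, m = 2 nodes: dim(W) = (2 x 3), dim(b) = (2 x 1)
W = np.array([[0.5, -0.2,  0.1],
              [0.3,  0.8, -0.5]])
b = np.array([0.1, -0.1])
u0 = np.array([1.0, 2.0, 3.0])
u1 = layer(u0, W, b)   # (m x 1) output vector
```

Note that dim(r) = dim(b) = (m x 1) falls out of the matrix-vector product automatically.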
Two-Layer Network
- Two layers; number of nodes in each layer need not be the same
- Node functions may be different, e.g., sigmoid hidden layer, linear output layer

y = u_2 = s_2(r_2) = s_2(W_2 u_1 + b_2)
        = s_2[W_2 s_1(r_1) + b_2]
        = s_2[W_2 s_1(W_1 u_0 + b_1) + b_2]
        = s_2[W_2 s_1(W_1 x + b_1) + b_2]
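The composition y = s_2[W_2 s_1(W_1 x + b_1) + b_2] can be written directly as nested function calls. A minimal sketch (illustrative weights, not from the slides) with a sigmoid hidden layer and a linear output layer:

```python
import numpy as np

def sigmoid(r):
    return 1.0 / (1.0 + np.exp(-r))

def two_layer(x, W1, b1, W2, b2):
    u1 = sigmoid(W1 @ x + b1)   # hidden layer: u1 = s1(W1 x + b1)
    return W2 @ u1 + b2         # linear output layer: s2 = identity

W1 = np.array([[1.0, -1.0],
               [0.5,  0.5]])
b1 = np.zeros(2)
W2 = np.array([[2.0, -1.0]])
b2 = np.array([0.5])
y = two_layer(np.array([1.0, 2.0]), W1, b1, W2, b2)
```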
Is a Neural Network Serial or Parallel?
- 3rd-degree power series, 4 coefficients

y = a_0 + a_1 x + a_2 x^2 + a_3 x^3
  = a_0' + a_1' r + a_2' r^2 + a_3' r^3
  = a_0' + a_1'(c_1 x + b_1) + a_2'(c_1 x + b_2)^2 + a_3'(c_1 x + b_3)^3
  = w_0 + w_1 s_1(u) + w_2 s_2(u) + w_3 s_3(u)
Is a Neural Network Serial or Parallel?
- Power series is serial, but it can be expressed as a parallel neural network (with dissimilar nodes)

Step:     u = s(r) = { 1, r > 0;  0, r <= 0 }
Logistic: u = s(r) = 1 / (1 + e^-r)
          u = s(r) = 1 / (1 + e^-(w_1 r_1 + w_2 r_2 + b))
Perceptron Neural Network

Single-Layer, Single-Node Perceptron Discriminants

u = s(w^T x + b) = { 1, (w^T x + b) > 0;  0, (w^T x + b) <= 0 }

Two inputs, x = [x_1; x_2]: the discriminant is the line
  w_1 x_1 + w_2 x_2 + b = 0, or x_2 = -(1/w_2)(w_1 x_1 + b)

Three inputs, x = [x_1; x_2; x_3]: the discriminant is the plane
  w_1 x_1 + w_2 x_2 + w_3 x_3 + b = 0, or x_3 = -(1/w_3)(w_1 x_1 + w_2 x_2 + b)
Single-Layer, Multi-Node Perceptron Discriminants

u = s(Wx + b)

- Multiple inputs, nodes, and outputs, e.g., x = [x_1; x_2] or x = [x_1; x_2; x_3]
- More inputs lead to more dimensions in discriminants
- More outputs lead to more discriminants
Sigmoid Activation Functions
- Alternative sigmoid functions
  - Logistic function: 0 to 1
  - Hyperbolic tangent: -1 to 1
  - Augmented ratio of squares: 0 to 1

Logistic:                   u = s(r) = 1 / (1 + e^-r)
Hyperbolic tangent:         u = s(r) = tanh r = (1 - e^-2r) / (1 + e^-2r)
Augmented ratio of squares: u = s(r) = r^2 / (1 + r^2)
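The three candidate node functions can be compared numerically. A sketch, transcribing the formulas as printed above (the "augmented ratio of squares" is taken literally as r^2/(1 + r^2)):

```python
import numpy as np

def logistic(r):
    """Logistic function, range 0 to 1."""
    return 1.0 / (1.0 + np.exp(-r))

def hyperbolic_tangent(r):
    """Hyperbolic tangent, range -1 to 1, written in its exponential form."""
    return (1.0 - np.exp(-2.0 * r)) / (1.0 + np.exp(-2.0 * r))

def augmented_ratio(r):
    """Augmented ratio of squares, range 0 to 1."""
    return r**2 / (1.0 + r**2)
```

All three saturate for large |r|; they differ in output range and in slope near r = 0, which matters for gradient-based training.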
Sigmoid Neural Network

Thresholded Neural Network Output
- Threshold gives yes/no output

y = r = w^T x + b
ε = y - y_T
J = ε^2/2 = (1/2)(y - y_T)^2 = (1/2)(y^2 - 2 y y_T + y_T^2)

Optimality condition: ∂J/∂p = 0
p = [w; b] = [p_1; p_2; ...; p_(n+1)]

Gradient:
∂J/∂p = (y - y_T) ∂y/∂p = (y - y_T) (∂y/∂r)(∂r/∂p)

where
∂r/∂p = [∂r/∂p_1  ∂r/∂p_2  ...  ∂r/∂p_(n+1)] = ∂(w^T x + b)/∂p = [x^T  1]
Steepest-Descent Learning for a Single Linear Neuron

Gradient:
∂J/∂p = (y - y_T)[x^T  1]

Steepest-descent algorithm:
  η = learning rate
  k = iteration index (epoch)

p_(k+1) = p_k - η (∂J/∂p_k)^T = p_k - η (y_k - y_Tk) [x_k; 1]

[w; b]_(k+1) = [w; b]_k - η (w_k^T x_k + b_k - y_Tk) [x_k; 1]
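The update [w; b]_(k+1) = [w; b]_k - η (w_k^T x_k + b_k - y_Tk) [x_k; 1] can be sketched as a short loop. This is an illustrative LMS-style sketch, not lecture code; the learning rate, epoch count, and the noiseless line y = 2x + 1 used as training data are all arbitrary choices.

```python
import numpy as np

def train_linear_neuron(X, yT, eta=0.05, epochs=1000):
    """X: (m x n) feature set; yT: (n,) targets. Returns weights w and bias b."""
    m, n = X.shape
    w = np.zeros(m)
    b = 0.0
    for _ in range(epochs):           # repetitive (off-line) training sequence
        for k in range(n):
            y = w @ X[:, k] + b       # linear neuron output
            err = y - yT[k]
            w -= eta * err * X[:, k]  # dJ/dw = (y - yT) x
            b -= eta * err            # dJ/db = (y - yT)
    return w, b

# Noiseless line y = 2 x + 1; steepest descent should recover w ~ 2, b ~ 1
X = np.array([[0.0, 1.0, 2.0, 3.0]])
yT = 2.0 * X[0] + 1.0
w, b = train_linear_neuron(X, yT)
```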
Backpropagation for a Single Linear Neuron

Training set (n members)
- Target outputs, y_T (1 x n)
- Feature set, X (m x n)

y_T = [y_T1  y_T2  ...  y_Tn]
X = [x_1  x_2  ...  x_n]

[w; b]_(k+1) = [w; b]_k - η (w_k^T x_k + b_k - y_Tk) [x_k; 1]

Estimate w and b recursively
- Off line (random or repetitive sequence)
- On line (measured training features and target)
Steepest-Descent Algorithm for a Single-Step Perceptron

Neuron output is discontinuous:
y = s(r) = { 1, r > 0;  0, r <= 0 }

[w; b]_(k+1) = [w; b]_k - η (y_k - y_Tk) [x_k; 1]

(y_k - y_Tk) = {  1, y_k = 1, y_Tk = 0
                  0, y_k = y_Tk
                 -1, y_k = 0, y_Tk = 1 }

y_T = [y_T1  y_T2  ...  y_Tn];  X = [x_1  x_2  ...  x_n]
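Since (y_k - y_Tk) takes only the values 1, 0, and -1, the rule above is the classical perceptron update. A sketch (the logical-AND data set is an illustrative choice, not from the slides):

```python
import numpy as np

def train_step_perceptron(X, yT, eta=0.1, epochs=50):
    """Perceptron training with step activation y = s(w^T x + b)."""
    m, n = X.shape
    w = np.zeros(m)
    b = 0.0
    for _ in range(epochs):
        for k in range(n):
            y = 1.0 if (w @ X[:, k] + b) > 0 else 0.0
            err = y - yT[k]           # takes values 1, 0, or -1
            w -= eta * err * X[:, k]
            b -= eta * err
    return w, b

# Linearly separable example: logical AND of two inputs
X = np.array([[0, 0, 1, 1],
              [0, 1, 0, 1]], dtype=float)
yT = np.array([0, 0, 0, 1], dtype=float)
w, b = train_step_perceptron(X, yT)
```

For linearly separable data the perceptron convergence theorem guarantees that this loop stops making updates after finitely many corrections.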
Logistic sigmoid and its derivative:

u = s(r) = 1 / (1 + e^-r)

dy/dr = ds(r)/dr = e^-r / (1 + e^-r)^2
      = e^-r s^2(r)
      = [(1 + e^-r) - 1] s^2(r)
      = [1/s(r) - 1] s^2(r)
      = [1 - s(r)] s(r) = (1 - y) y

ε = y - y_T
J = ε^2/2 = (1/2)(y - y_T)^2 = (1/2)(y^2 - 2 y y_T + y_T^2)

Control parameter: p = [w; b] = [p_1; p_2; ...; p_(n+1)]
Training a Single Sigmoid Neuron

∂J/∂p = (y - y_T) ∂y/∂p = (y - y_T)(∂y/∂r)(∂r/∂p)

where
r = w^T x + b
dy/dr = (1 - y) y
∂r/∂p = [x^T  1]

∂J/∂p = (y - y_T)(1 - y) y [x^T  1]

p_(k+1) = p_k - η (∂J/∂p_k)^T

or
[w; b]_(k+1) = [w; b]_k - η (y_k - y_Tk)(1 - y_k) y_k [x_k; 1]
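The sigmoid-neuron update differs from the linear case only by the extra (1 - y) y factor from dy/dr. An illustrative sketch (the logical-OR data set and hyperparameters are arbitrary choices, not from the slides):

```python
import numpy as np

def train_sigmoid_neuron(X, yT, eta=0.5, epochs=5000):
    """Gradient training of one sigmoid neuron: dJ/dp = (y - yT)(1 - y) y [x^T 1]."""
    m, n = X.shape
    w = np.zeros(m)
    b = 0.0
    for _ in range(epochs):
        for k in range(n):
            y = 1.0 / (1.0 + np.exp(-(w @ X[:, k] + b)))
            g = (y - yT[k]) * (1.0 - y) * y   # scalar part of the gradient
            w -= eta * g * X[:, k]
            b -= eta * g
    return w, b

# Linearly separable example: logical OR of two inputs
X = np.array([[0, 0, 1, 1],
              [0, 1, 0, 1]], dtype=float)
yT = np.array([0, 1, 1, 1], dtype=float)
w, b = train_sigmoid_neuron(X, yT)
```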
Training a Sigmoid Network

Two parameter vectors for 2-layer network:
p_1,2 = [w; b]_1,2 = [p_1; p_2; ...; p_(n+1)]_1,2

Output vector:
y = u_2 = s_2(r_2) = s_2(W_2 u_1 + b_2)
        = s_2[W_2 s_1(r_1) + b_2]
        = s_2[W_2 s_1(W_1 u_0 + b_1) + b_2]
        = s_2[W_2 s_1(W_1 x + b_1) + b_2]
Training a Sigmoid Network

Δp_1,2|_k = -η ∂J/∂p_1,2 |_k

where
∂J/∂p_1,2 = (y - y_T) ∂y/∂p_1,2 = (y - y_T)(∂y/∂r_1,2)(∂r_1,2/∂p_1,2)

with
r_1,2 = W_1,2 u_0,1 + b_1,2

∂y/∂r_2 = I

∂y/∂r_1 = diag[(1 - y_1) y_1, (1 - y_2) y_2, ..., (1 - y_n) y_n]

∂r_1/∂p_1 = [x^T  1];   ∂r_2/∂p_2 = [u_1^T  1]
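Putting the pieces together (∂y/∂r_2 = I for the linear output layer, the diagonal (1 - y) y factor for the sigmoid layer, and [x^T 1], [u_1^T 1] for the parameter gradients) gives two-layer back-propagation. A sketch under stated assumptions: the layer width, learning rate, random initialization, and logical-OR training data are illustrative, not from the slides.

```python
import numpy as np

def sigmoid(r):
    return 1.0 / (1.0 + np.exp(-r))

def train_two_layer(X, YT, hidden=3, eta=0.3, epochs=3000, seed=0):
    """Back-propagation for a sigmoid hidden layer and linear output layer."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    p = YT.shape[0]
    W1 = rng.normal(0.0, 0.5, (hidden, m)); b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.5, (p, hidden)); b2 = np.zeros(p)
    for _ in range(epochs):
        for k in range(n):
            x, yT = X[:, k], YT[:, k]
            u1 = sigmoid(W1 @ x + b1)          # hidden layer output
            y = W2 @ u1 + b2                   # linear output: dy/dr2 = I
            e = y - yT                         # output error (y - yT)
            d1 = (W2.T @ e) * u1 * (1.0 - u1)  # back-propagated via diag[(1-u1)u1]
            W2 -= eta * np.outer(e, u1); b2 -= eta * e
            W1 -= eta * np.outer(d1, x); b1 -= eta * d1
    return W1, b1, W2, b2

X = np.array([[0, 0, 1, 1],
              [0, 1, 0, 1]], dtype=float)
YT = np.array([[0, 1, 1, 1]], dtype=float)     # logical OR
W1, b1, W2, b2 = train_two_layer(X, YT)
```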
Desmoplastic small, round blue-cell tumors (SRBCT)

Overview of Present SRBCT Analysis

Clustering of SRBCT Ensemble Averages
[Figure panels 1-4: EWS set, BL set, NB set, RMS set]
Transcript Training Set
- Each training sample is represented by 8 ensemble-average features:
  EWS(+), EWS(-), BL(+), BL(-), NB(+), NB(-), RMS(+), RMS(-) averages
- Binary vector output (0,1) after rounding
- Samples: Sample 1 (EWS), Sample 2 (EWS), Sample 3 (EWS), ...,
  Sample 62 (RMS), Sample 63 (RMS)
SRBCT Neural Network Training
- Neural network
  - 8 ensemble-average inputs
  - various # of sigmoidal neurons
  - 4 linear neurons
  - 4 outputs
- 100% accuracy
Leave-One-Out Validation of SRBCT Neural Network
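Leave-one-out validation holds out each sample in turn, trains on the rest, and tests on the held-out one. A generic sketch of that loop (the nearest-class-mean "model" and the toy data are illustrative stand-ins for the network and SRBCT data):

```python
import numpy as np

def leave_one_out(X, Y, train_fn, predict_fn):
    """X: (m x n) features; Y: (p x n) one-hot targets. Returns error count."""
    n = X.shape[1]
    errors = 0
    for i in range(n):
        keep = np.arange(n) != i                       # drop column i
        model = train_fn(X[:, keep], Y[:, keep])       # train on the rest
        if not np.array_equal(np.round(predict_fn(model, X[:, i])), Y[:, i]):
            errors += 1                                # test on the held-out sample
    return errors

def train_means(X, Y):
    # Toy "training": mean feature vector of each class
    labels = np.argmax(Y, axis=0)
    return np.stack([X[:, labels == c].mean(axis=1) for c in range(Y.shape[0])])

def predict_nearest(means, x):
    # One-hot output for the nearest class mean
    out = np.zeros(means.shape[0])
    out[np.argmin(np.linalg.norm(means - x, axis=1))] = 1.0
    return out

# Two well-separated classes -> zero leave-one-out errors
X = np.array([[0.0, 0.1, -0.1, 5.0, 5.1, 4.9]])
Y = np.array([[1, 1, 1, 0, 0, 0],
              [0, 0, 0, 1, 1, 1]], dtype=float)
errors = leave_one_out(X, Y, train_means, predict_nearest)
```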
Novel-Set Validation of SRBCT Neural Network
- Network always chooses one of four classes (i.e., unknown is not an option)
- Test on 25 novel samples (400 times each)
  - 5 EWS
  - 5 BL
  - 5 NB
  - 5 RMS
  - 5 samples of unknown class
Next Time:
Neural Networks - 2
Supplementary Material

Axon Hillock

Adaptation/Learning (potentiation)
- Short-term
- Long-term
[Flattened data table: columns are Sample 1, Sample 2, Sample 3, ..., Sample n-1, Sample n; the first row gives the tumor identifier (Tumor or Normal), and the remaining rows give the expression level of each gene m, e.g., 82, 14, 75, -73, 24, 101, ...]
Neural Network Classification Example
- ~7000 genes expressed in 62 microarray samples
- Tumor = 1, Normal = 0
- Zero classification errors

Classification =
  Columns 1 through 13:  1 1 1 1 1 1 ...
  Columns 14 through 26: 1 1 1 1 1 1 ...
  Columns 27 through 39: 1 1 1 1 1 1 ...
  Columns 40 through 52: 1 0 0 0 0 0 ...
  Columns 53 through 62: 0 0 0 0 0 0 ...
  (remaining entries in each row lost in extraction)
0 = Normal, 1 = Adenoma, 2 = A Tumor, 3 = B Tumor, 4 = C Tumor, 5 = D Tumor

Target = [2 1 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
          4 4 4 4 4 4 4 4 4 5 5 5 5 5 5 5 5 1
          0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]

One classification error
- Scalar network output with varying magnitude

Classification =
  Columns 1 through 13:  2 1 3 3 3 3 3 ...
  Columns 14 through 26: 3 3 3 3 3 3 3 ...
  Columns 27 through 39: 4 4 4 5 5 5 5 ...
  Columns 40 through 52: 0 0 0 0 0 0 0 ...
  Columns 53 through 60: 0 0 0 0 0 0 0 ...
  (remaining entries in each row lost in extraction)
Ranking by EWS t Values (Top and Bottom 12)
[Table residue: columns of Student t values for the top- and bottom-ranked genes; e.g., the bottom 12 EWS t values run from -5.04 down to -9.26, and the top values reach 9.65]

Repeated for BL vs. remainder, NB vs. remainder, and RMS vs. remainder
[Flattened table: significant genes with per-class statistics. Columns: Gene Description (with clone ID), EWS Student t value, BL Student t value, NB Student t value, RMS Student t value, most significant class by t value, and Khan gene class. Rows include: 296448 insulin-like growth factor 2 (somatomedin A); 841641 cyclin D1 (PRAD1: parathyroid adenomatosis 1); 365826 growth arrest-specific 1; 486787 calponin 3, acidic. Class entries include RMS, BL (-), EWS, EWS/NB, EWS/RMS, RMS/NB, Not BL, Not EWS; numeric entries scattered in extraction.]
    % Scale the data for training (MATLAB Neural Network Toolbox)
    premnmx(TrainingData, Target);          % output arguments lost in extraction

    % Leave-one-out setup: hold out column iSam
    ValidSample = TrainingData(:,iSam);     % name inferred from the sim call below
    % Remaining assignments lost in extraction: copies of TrainingData and
    % Target with column iSam deleted, plus one assignment "... = 2;"

    % Simulate the network on the held-out sample and round the output
    NovelOutput  = sim(Net, ValidSample);
    LengthNO     = length(NovelOutput);
    NovelRounded = round(NovelOutput);
    NovelRounded = max(NovelRounded, zeros(LengthNO,1));
    NovelRounded = min(NovelRounded, ones(LengthNO,1));

    % If no actual output is greater than 0.5, choose the largest
    % (stray outer-loop fragment in source: for k = 1:SizeNO(2))
    if isequal(NovelRounded, zeros(LengthNO,1))
        [c,j] = max(NovelOutput);
        NovelRounded(j,1) = 1;
    end

    AbsDiff = abs(NovelOutput - NovelRounded);

    % If two rounded outputs are "1", choose the one whose actual output is
    % closest to "1"
    for j = 1:(LengthNO - 1)
        if NovelRounded(j) == 1
            for k = (j + 1):LengthNO
                if NovelRounded(k) == 1
                    if AbsDiff(j) < AbsDiff(k)
                        NovelRounded(k) = 0;
                    else
                        NovelRounded(j) = 0;
                    end
                end
            end
        end
    end

    NovelError = Target(:,iSam) - NovelRounded;
    % (loop-bound residue in source: iSam * Repeats)
Algorithm for Network Training

Command definitions:
  Tc : Throttle command
  Sc : Spoiler command
  Ac : Aileron command
  Rc : Rudder command
   c : Pitch angle command   (leading Greek symbol lost in extraction)
   c : Yaw rate command      (leading Greek symbol lost in extraction)