
Evolving Neural Network Models

Yoshiaki Tsukamoto, Akira Namatame


Dept. of Computer Science, National Defense Academy, Yokosuka, Kanagawa 239, Japan. {tsuka,nama}@cc.nda.ac.jp

Abstract- Neural networks in nature are not designed but evolved, and they should learn their structure through interaction with their environment. This paper introduces the notion of an adaptive neural network model with reflection. We show how reflection can implement adaptive processes, and how adaptive mechanisms are actualized using the concept of reflection. Learning mechanisms must be understood in terms of their specific adaptive functions. We introduce an adaptive function that enables the network to adjust its internal structure by itself, by modifying its adaptive function and the associated learning parameters. We then provide a model of emergent neural networks. We show that the emergent neural network model is especially suitable for constructing large-scale, heterogeneous neural networks with composite and recursive architectures, in which each component unit is itself modeled as another neural network. Using the emergent neural network model, we introduce the concepts of composition and recursion for integrating heterogeneous neural network modules that are trained individually.

I. INTRODUCTION

The ability to learn is the most important property of living systems. Evolution and learning are the two most fundamental adaptation processes, and their relationship is complex. Studying the evolution and development of biological systems can reveal how structures are formed through interactions with the environment. These structural adaptation mechanisms in biological systems suggest ways of building adaptable structures, so that the network eventually grows into a configuration suited to the class of problems characterized by the training patterns. The initial step is to explore aspects of this relationship by defining evolution as the process by which the learning procedure itself is adjusted [4]. Taking this approach, we view evolution as an adaptive process acting on the learning mechanism; here, the learning process itself is the object of evolution. Current neural network models allow a network to adjust its behavior by changing the interconnection weights associating neurons with each other, but the architecture of the network must be set up by system designers, and once the structure is designed it remains fixed. This places a severe constraint on the applicability of neural network models. In most current neural network models, learning is done through modification of the synaptic weights of neurons in the network [1][5][8]. This kind of learning is basically a parameter adaptation process. The structure of neural networks should instead be evolved and developed rather than pre-specified, and we need to develop a framework for such an adaptable process. We introduce an adaptive neural network model with reflection as

a framework for an adaptive process evolving in a dynamic environment. Adaptation is viewed either as a modification of one's behavior or as a modification of one's environment. Learning mechanisms must be understood in terms of their specific adaptive functions. We introduce the adaptive function of a neural network, and the self-reflective (adaptive) process of a neural network is modeled as adjusting its internal structure to an evolving environment by modifying its adaptive function and the associated learning parameters. We also investigate reflective learning among multiple neural network modules [7]. In the multiple-module setting, two types of reflection may occur: each network module learns on its own by adjusting its adaptive function and the associated learning parameters, while at the same time the network modules mutually interact and learn as a group to obtain coordinated adaptive functions and learning parameters.

II. FORMULATION OF ADAPTIVE NEURAL NETWORKS

A. Definition of an adaptive function
The weighing-evidence scheme does not demand that every feature of an object be present; instead it only weighs the evidence that the object is present. That is, a knowledge object becomes active whenever the weighted summation of the present features exceeds the threshold level. Numerical weights are assigned to each feature, following the theme of weighing evidence. We denote the set of objects by W = {O_i : i = 1, 2, ..., k}. The list of feature values is represented as d = (d_1, d_2, ..., d_n) ∈ D, where the variables d_1, d_2, ..., d_n take Boolean values. We denote training examples as the ordered pairs (d_t, c_t), where d_t ∈ D and c_t ∈ {0, 1}. The set of ordered pairs (D, C) = {(d_t, c_t) : t = 1, 2, ..., T} is termed a training set.

Definition 1 Let C+ and C− be the sets of the positive and negative examples of the concept C, respectively. The summations of the inner products of the Boolean vectors d_p, d_q ∈ D, denoted G+(d_p) and G−(d_p), are defined as:

0-7803-2902-3/96/$4.00 © 1996 IEEE

G+(d_p) = Σ_{d_q ∈ C+} ⟨d_p, d_q⟩   (1)

G−(d_p) = Σ_{d_q ∈ C−} ⟨d_p, d_q⟩   (2)

where ⟨d_p, d_q⟩ = Σ_{i=1}^{n} d_{pi} d_{qi}.

We define the similarity matrix of the training set D under the concept C as the T × 2 matrix

Γ(D, C) = [G+(d_t), G−(d_t)], t = 1, 2, ..., T.   (3)

In the next theorem, we provide the procedure for obtaining the activation function, i.e., the method of each neuro-object.
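As an illustrative sketch of Definition 1 and Eq. (3) (assuming, as above, that G+ and G− are the summed inner products of an example with the positive and negative training examples), the similarity matrix can be computed as follows. All names here are illustrative, not from the paper:

```python
# Sketch of the similarity matrix G(D, C) of Eq. (3): for each of the T
# training examples, stack the pair (G+(d_t), G-(d_t)), where G+/G- are the
# summed inner products with the positive/negative examples (Definition 1).

def inner(dp, dq):
    """Inner product of two Boolean feature vectors."""
    return sum(a * b for a, b in zip(dp, dq))

def similarity_matrix(training_set):
    """training_set: list of (d_t, c_t) with d_t a 0/1 tuple, c_t in {0, 1}."""
    positives = [d for d, c in training_set if c == 1]
    negatives = [d for d, c in training_set if c == 0]
    matrix = []
    for d, _ in training_set:
        g_plus = sum(inner(d, p) for p in positives)   # G+(d_t)
        g_minus = sum(inner(d, q) for q in negatives)  # G-(d_t)
        matrix.append((g_plus, g_minus))
    return matrix  # a T x 2 matrix

# Toy concept: "at least two of the three features are present".
T = [((1, 1, 0), 1), ((1, 0, 1), 1), ((0, 1, 1), 1), ((1, 1, 1), 1),
     ((0, 0, 0), 0), ((1, 0, 0), 0), ((0, 1, 0), 0), ((0, 0, 1), 0)]
print(similarity_matrix(T))
```

Each row weighs how much a training example resembles the positive class versus the negative class, which is the evidence the threshold unit of Theorem 1 operates on.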


Theorem 1 [6] Suppose the similarity matrix Γ(D, C) is linearly separable. We define the connection weights w_i, i = 1, 2, ..., n, as

w_i = Σ_{d_p ∈ C+} d_{pi} − Σ_{d_q ∈ C−} d_{qi}   (4)

Fig. 2. The components of a Neuro-Agent.

and the activation function with the above connection weights as

F(x) = Σ_{i=1}^{n} w_i x_i + θ   (5)

Then the activation function becomes the linear threshold function of a neuro-object; that is, it satisfies:
(i) for a positive training example, d_p ∈ C+ implies F(d_p) > 0;
(ii) for a negative training example, d_q ∈ C− implies F(d_q) < 0.
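A minimal sketch of the Theorem 1 construction follows. The exact weight formula is partly lost in the source, so this is a hedged reconstruction: w_i is taken as the evidence for feature i among the positives minus that among the negatives, and θ is placed between the two classes when the resulting scores are separable. All function names are illustrative:

```python
# Sketch of the Theorem 1 threshold unit (reconstruction, see lead-in):
# Eq. (4) weighs the evidence for each feature; Eq. (5) is the linear
# threshold function F(x) = sum_i w_i x_i + theta.

def learn_threshold_unit(training_set):
    n = len(training_set[0][0])
    pos = [d for d, c in training_set if c == 1]
    neg = [d for d, c in training_set if c == 0]
    # Eq. (4): evidence for feature i among positives minus negatives.
    w = [sum(p[i] for p in pos) - sum(q[i] for q in neg) for i in range(n)]
    scores_pos = [sum(wi * di for wi, di in zip(w, d)) for d in pos]
    scores_neg = [sum(wi * di for wi, di in zip(w, d)) for d in neg]
    if min(scores_pos) <= max(scores_neg):
        raise ValueError("training set is not separable by these weights")
    # Place theta midway between the classes, so F > 0 on C+ and F < 0 on C-.
    theta = -(min(scores_pos) + max(scores_neg)) / 2.0
    # Eq. (5): the linear threshold (activation) function of the neuro-object.
    return lambda d: sum(wi * di for wi, di in zip(w, d)) + theta

T = [((1, 1, 0), 1), ((1, 0, 1), 1), ((0, 1, 1), 1), ((1, 1, 1), 1),
     ((0, 0, 0), 0), ((1, 0, 0), 0), ((0, 1, 0), 0), ((0, 0, 1), 0)]
F = learn_threshold_unit(T)
assert all(F(d) > 0 for d, c in T if c == 1)   # condition (i)
assert all(F(d) < 0 for d, c in T if c == 0)   # condition (ii)
```

On this toy majority concept the learned weights are uniform and the unit satisfies conditions (i) and (ii) of the theorem.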

B. Reflection in a dynamic environment

A vital component of the learning process is the environment. If the environment were relatively static, there might be little need for learning to evolve. But if the environment is dynamic and diverse, innate environment-specific mechanisms are of little use. In this way, a dynamic environment encourages the evolution of learning. In this section we propose a general framework for self-reflective adaptation that allows a neural network to change its structure autonomously, and we investigate the effect of an environmental change on the evolution of learning. Figure 1 shows the self-reflective process of adapting to a new learning environment by adjusting the adaptive function and learning parameters.

III. A MODEL OF NEURO-AGENTS

Many real-world situations are currently being modeled as a set of cooperating intelligent agents. We explore an approach to specifying neural networks using the neuro-agent model. We use the neuro-agent model as a neutral metaphor for describing massively parallel and distributed processing elements. The neuro-agent model uses the object-oriented approach for the specification of neural networks with a self-growth capability. The components of the neuro-agent model are shown in Figure 2. The neuro-agent model comprises the knowledge processing and the


Fig. 3. The components of neuro-objects with parallel processing capability via message passing.

internal memory. The knowledge processing consists of the message processing component, the training set, and the learning mechanism. The message processing component is the communication board with other neuro-agents. The neuro-agent model assumes no specific inter-processing protocol other than message passing. The internal memory of a neuro-agent is made of many smaller processes, termed neuro-objects, as shown in Figure 3. Each neuro-object has procedures called methods that specify its behavior, given by an activation rule. Each neuro-object becomes active when it receives a message that invokes a method. A neuro-object can learn the types of messages that initiate it. The internal memory contains domain knowledge and expertise as a set of those neuro-objects. Multiple methods of neuro-objects can be activated by a single message, and this property of activating several methods simultaneously improves the parallel processing capability. To adapt successfully to its environment, a neuro-agent needs to learn from previous experiences and apply the learned knowledge to subsequent situations.
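The message-driven activation of neuro-objects described above can be sketched as follows. The class and method names here are hypothetical illustrations, not code from the paper:

```python
# Illustrative sketch: a neuro-agent whose internal memory holds
# neuro-objects; a single message may activate the methods of several
# neuro-objects simultaneously (parallel activation).

class NeuroObject:
    def __init__(self, name, triggers):
        self.name = name
        self.triggers = set(triggers)  # message types that invoke a method

    def receives(self, message):
        return message in self.triggers

class NeuroAgent:
    def __init__(self, neuro_objects):
        self.memory = list(neuro_objects)  # internal memory of neuro-objects

    def process(self, message):
        # Message processing: every neuro-object whose method is invoked
        # by this message becomes active.
        return [obj.name for obj in self.memory if obj.receives(message)]

agent = NeuroAgent([
    NeuroObject("object1", ["classify", "train"]),
    NeuroObject("object2", ["classify"]),
    NeuroObject("object3", ["train"]),
])
print(agent.process("classify"))  # several methods may fire at once
```

A "classify" message activates object1 and object2 together, which is the simultaneous multi-method activation described in the text.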


Fig. 1. A self-reflective learning process adapting to a new learning environment, starting from the initial training set (D, C).

A neuro-agent learns how to respond to a message by a weighing-evidence learning method: rather than demanding that every feature be present, it only weighs the evidence that each feature is present.

where

I(D_i, D_j, C_j) = D_i D_j^T [C_j, 1 − C_j]   (7)

IV. A MODEL OF EVOLVING NEURAL NETWORKS

A. Composition of homogeneous network modules


In this section, we discuss cooperative learning among multiple homogeneous neural network modules and show the coordination mechanism. We consider composite learning with a set of modules, each coupled with a different subset of the whole learning space T = (D, C). The training set T = (D, C) is composed from the subsets T_i, i = 1, 2, ..., n, where T_i represents the i-th subset of T = (D, C). In this method of composition, however, the results of the distributed modules are under-specified with respect to the whole problem [2][3][8]. In the next theorem, we show that the properties of each subset of the training set essentially determine the learnability of the whole training set. We denote the training set of each network module as T_i = (D_i, C_i), i = 1, 2, ..., n. The integrated training set is described as T = (D, C) = ⋃_{i=1}^{n} T_i. The aggregate matrix of each network module is represented by Γ(D_i, C_i), where the Γ(D_i, C_i) are the respective similarity matrices. At the composite phase, each network module trained with the initial training set Γ(D_i, C_i), i = 1, 2, ..., n, modifies its adaptive function for coordination; that is, each network module modifies its similarity matrix as

Γ_i(D, C) = Σ_{i=1}^{n} Γ(D_i, C_i)   (8)

Theorem 2 If the similarity matrices Γ_i(D, C) are linearly separable on the learning parameter L = (α, β, θ), then the whole training set (D, C) is also linearly separable on the learning parameter L = (α, β, θ). The connection weights w_i, i = 1, 2, ..., m, are given as the summation of the weights w_i^j, j = 1, 2, ..., n, obtained from each subset (D_j, C_j), j = 1, 2, ..., n, of the training set (D, C):

w_i = Σ_{j=1}^{n} w_i^j   (9)
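The weight summation of Eq. (9) can be illustrated with the evidence-count weight rule sketched earlier (an assumption, since the paper's exact weight formula is partly lost): because counting is additive over a partition of the training set, the weights obtained from the whole set equal the elementwise sum of the weights obtained from disjoint subsets. Names below are illustrative:

```python
# Sketch of Eq. (9): with evidence-count weights, training on the whole set
# gives exactly the sum of the per-module weights over a disjoint partition
# T = T1 ∪ T2, because feature counts are additive over a partition.

def evidence_weights(training_set, n):
    pos = [d for d, c in training_set if c == 1]
    neg = [d for d, c in training_set if c == 0]
    return [sum(p[i] for p in pos) - sum(q[i] for q in neg) for i in range(n)]

T1 = [((1, 1, 0), 1), ((0, 0, 0), 0), ((1, 0, 0), 0)]
T2 = [((1, 0, 1), 1), ((0, 1, 1), 1), ((1, 1, 1), 1),
      ((0, 1, 0), 0), ((0, 0, 1), 0)]

w1 = evidence_weights(T1, 3)            # weights of module 1
w2 = evidence_weights(T2, 3)            # weights of module 2
w_whole = evidence_weights(T1 + T2, 3)  # weights of the integrated set
assert w_whole == [a + b for a, b in zip(w1, w2)]  # w_i = sum_j w_i^j
```

This additivity is what lets individually trained modules be composed by simply summing their connection weights.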
We now discuss composite learning in which some training subset may contain inconsistent training examples: the concept value C may be different for the same input pattern. That is, the training subsets (D_i, C_i), i = 1, 2, ..., n, may contain input patterns d_t ∈ D_i, d_s ∈ D_j, i ≠ j, such that d_t = d_s and c_t ≠ c_s. Since the target concept is determined by the combination of the decomposed input space, this kind of inconsistency may occur in the composition of training subsets. The inconsistency problem is handled by generating appropriate intermediate concepts H = [H_1, H_2, ..., H_n] so that each training subset (D_i, C_i) becomes a consistent training set under the intermediate concept H_i, i = 1, 2, ..., n, by generating hidden units as follows. Suppose there are m inconsistent training examples d_1, d_2, ..., d_m ∈ D_i that are the same input pattern, d_1 = d_2 = ... = d_m, with different concept values. In this case, we generate


a hidden unit corresponding to those inconsistent input patterns, such that it classifies those input patterns as positive training examples and the others as negative examples. Then the training subset (D_i, H_i), i = 1, 2, ..., n, becomes a consistent training set. The concept C is then represented as the disjunctive form of those intermediate concepts.
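The inconsistency handling above can be sketched in two steps: detect input patterns that carry conflicting concept values across subsets, then define an intermediate concept that is positive exactly on those patterns. Function names are illustrative:

```python
# Sketch of the intermediate-concept construction: conflicting patterns
# across training subsets become the positives of a hidden unit H_i, so the
# relabeled subset is consistent.

def inconsistent_patterns(subsets):
    """Return the input patterns that appear with conflicting labels."""
    labels = {}
    conflicts = set()
    for subset in subsets:
        for d, c in subset:
            if d in labels and labels[d] != c:
                conflicts.add(d)
            labels.setdefault(d, c)
    return conflicts

def intermediate_concept(subset, conflicts):
    """Relabel a subset under H_i: positive exactly on conflicting patterns."""
    return [(d, 1 if d in conflicts else 0) for d, _ in subset]

T1 = [((1, 0), 1), ((0, 0), 0)]
T2 = [((1, 0), 0), ((0, 1), 1)]   # (1, 0) conflicts with its label in T1
conflicts = inconsistent_patterns([T1, T2])
print(conflicts)
```

Each relabeled subset is consistent by construction, and the original concept can then be recovered as a disjunction over the intermediate concepts.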

with the class-subclass relation, such that for C_i, C_j ∈ C

C_i → C_j or C_j → C_i   (15)

B. Composition of heterogeneous network modules


In this section, we discuss cooperative learning among multiple heterogeneous neural network modules and show the coordination mechanism. We denote the training set of each network module as T_i = (D_i, C), i = 1, 2, ..., n. The integrated training set is described as T = (D, C), where D_i ⊂ D and ⋃_{i=1}^{n} D_i = D. The aggregate matrix of each network module is represented by Γ(D_i, C), where the Γ(D_i, C) are the respective similarity matrices. At the composite phase, each network module trained with the initial training set Γ(D_i, C), i = 1, 2, ..., n, modifies its adaptive function for coordination; that is, each network module modifies its similarity matrix as follows:

is defined as a generalization hierarchy. A generalization hierarchy is a class-subclass hierarchy in which objects belonging to a subclass inherit the properties of their immediate superclass. Each object in W can be described by bit vectors. The i-th attribute of each object is described by a vector ψ_i of length m_i, the j-th position of ψ_i being either 1 or 0, indicating that the j-th value of the attribute A_i is or is not present, respectively. Each object therefore consists of a set of vectors (ψ_1, ψ_2, ..., ψ_n) of lengths m_i. Each object is classified into several classes C = C_1 × C_2 × ... × C_k. Each element of {c_1, c_2, ..., c_k} is termed a class value, and c_i takes one if an object O_i belongs to the class C_i and zero otherwise. An object O_i is then represented as a vector of pairs of the form

Γ_i(D, C) = Σ_{i=1}^{n} Γ(D_i, C)   (10)

Theorem 3 If the similarity matrices Γ_i(D, C) are linearly separable on the learning parameters L = (α, β, θ_i), then the whole training set (D, C) is also linearly separable on the learning parameter L = (α, β, θ = Σ_{i=1}^{n} θ_i). The connection weights w_i, i = 1, 2, ..., m, are given as the summation of the weights w_i^j, j = 1, 2, ..., n, obtained from each subset (D_j, C_j), j = 1, 2, ..., n, of the training set (D, C).

where ψ_{A_i}(O_i) is termed the attribute characteristic function, defined over the attribute subspace of A_i, and γ_{C_j}(O_i) is termed the class characteristic function, defined over the class C_j. The attribute characteristic function defined over the attribute space A = A_1 × A_2 × ... × A_n is denoted as

Similarly, the class characteristic function defined over the class space C = C_1 × C_2 × ... × C_k is defined as

γ_C(O_i) = {γ_{C_1}(O_i), γ_{C_2}(O_i), ..., γ_{C_k}(O_i)}   (18)
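The two characteristic functions can be sketched concretely: each attribute value becomes a one-hot vector ψ_i over Dom(A_i), and the class characteristic function is a 0/1 membership vector over the classes. The attribute names and domains below are hypothetical examples:

```python
# Sketch of the attribute characteristic function psi_i (one-hot over
# Dom(A_i)) and the class characteristic function gamma_C (Eq. (18), a 0/1
# membership vector over the class space).

def psi(value, domain):
    """One-hot attribute characteristic vector over Dom(A_i)."""
    return [1 if v == value else 0 for v in domain]

def gamma(member_classes, all_classes):
    """Class characteristic vector: 1 iff the object belongs to C_j."""
    return [1 if c in member_classes else 0 for c in all_classes]

# Hypothetical object with two attributes and their domains.
domains = {"color": ["red", "green", "blue"], "size": ["small", "large"]}
obj = {"color": "green", "size": "large"}
encoding = [psi(obj[a], dom) for a, dom in domains.items()]  # (psi_1, psi_2)
classes = ["C1", "C2", "C3"]
print(encoding, gamma({"C2"}, classes))
```

The object is thus represented by the pair of bit-vector families the section describes: its attribute codes (ψ_1, ..., ψ_n) and its class code γ_C(O_i).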

C. Constructing large scale neural networks


This section provides an abstract design model of neural networks. An abstract model is especially suitable for constructing large-scale, heterogeneous neural networks, which are made up of many small-scale networks trained individually. We consider a set of objects W = {O_i : i = 1, 2, ..., n}, where each object is characterized by n attributes A_1, A_2, ..., A_n with domains Dom(A_i), i = 1, 2, ..., n.

The set of those characteristic functions over the whole object space W = {O_i : i = 1, 2, ..., k} is denoted as

Definition 2 Any subset C_i of the object space W, termed a class, is defined as

C_i = {A_1 : Dom(A_1), A_2 : Dom(A_2), ..., A_n : Dom(A_n)}   (13)

The set of these classes C = {C_1, C_2, ..., C_k}   (14)

defines the training set. We propose hierarchical learning to specify the class code γ_C(O_i), which allows all ancestors and descendants of any class in a generalization hierarchy to be inferred simultaneously. With this hierarchical learning algorithm, newly generated classes can be self-organized into the preexisting generalization hierarchy C. We assume the hierarchical relation property for pairs of classes C_i, C_j ∈ C. The class codes for the elements of C are obtained by the following hierarchical learning algorithm.
Step 1.1 Each element of C is locally coded. That is, we provide the local code Γ_i to C_i, the i-th element of C, where the i-th element of Γ_i is 1 and the others are zero.


Fig. 4. Evolving networks with the composite and recursive architecture (meta-network module).

Step 1.2 Each class C_j inherits the hierarchical code from its immediate ancestor class C_i. That is, the class characteristic code of C_j after the inheritance is modified as follows:

Γ_j = Γ_j ⊕ Γ_i   (21)

Step 1.3 For the immediate descendant class C_j of C_i, the class code of C_i is inherited into the class code of C_j as follows:

Γ_j = Γ_i ⊕ Γ_j   (22)

where ⊕ represents the bitwise OR. We have the following property of the class codes.

Lemma 1 For a pair of class codes Γ_i and Γ_j,

Γ_i ⊗ Γ_j = Γ_j if C_i → C_j, and Γ_i ⊗ Γ_j ≠ Γ_j otherwise.   (23)

The attribute characteristic function ψ_A(O_i) is obtained by applying the corresponding coding equation. Using the above coding scheme, we can generate emergent neural networks with the composite and recursive architecture shown in Figure 4.

V. CONCLUSION

This paper showed how an adaptive neural network model with reflection can implement adaptive processes. We introduced the concept of an adaptive function, and learning mechanisms were understood in terms of their specific adaptive functions. A neural network module was modeled to adjust its internal state to evolving training environments by modifying its adaptive function. We also investigated reflective learning among multiple network modules. In the multiple-module setting, two types of reflection were considered: the network modules mutually interact and can learn as a group, while at the same time each network module can also learn on its own by adjusting its adaptive function. The overall system is constructed from several interacting learning modules. We then discussed how an assembly of modules cooperates to learn as a function of their individual internal states: many separate network modules combine their individually defined adaptive functions by coordinating learning parameters. We have also used the agent-oriented model as an architecturally neutral metaphor for describing massively parallel, distributed, and cooperative neural network modules. The agent-oriented model for the specification of neural networks consists of cooperative distributed processing elements called autonomous neuro-agents with self-reflection. Each neuro-agent encapsulates a specific set of knowledge obtained from a different training set. At the cooperative stage, neuro-agents put forward their learned knowledge to obtain coordinated learning parameters.

REFERENCES

[1] Barnden, J. A. and Pollack, J. B.: High-Level Connectionist Models, Ablex Publishing (1991).
[2] Hrycej, T.: Modular Learning in Neural Networks, Wiley-Interscience (1992).
[3] Jacobs, R. A., Jordan, M. I., et al.: Adaptive Mixtures of Local Experts, Neural Computation, Vol. 3, pp. 79-87 (1991).
[4] Lee, T. (ed.): Structure Level Adaptation for Artificial Neural Networks, Kluwer Academic Publishers (1991).
[5] Nadal, J.: Study of a Growth Algorithm for a Feedforward Network, International Journal of Neural Networks, Vol. 1, pp. 55-59 (1989).
[6] Namatame, A. and Tsukamoto, Y.: Structural Connectionist Learning with Complementary Coding, International Journal of Neural Systems, Vol. 3, No. 1, pp. 19-30 (1992).
[7] Sian, S. S.: The Role of Cooperation in Multi-Agent Learning Systems, in Deen, S. M. (ed.): Cooperative Knowledge-Based Systems 1990, Springer-Verlag, pp. 67-84 (1990).
[8] Sikora, R.: Learning Control Strategies for Chemical Processes, IEEE Expert, Vol. 7, No. 3, pp. 35-43 (1992).

