Aug 08, 2008

© Attribution Non-Commercial (BY-NC)

Rohitash Chandra¹ and Christian W. Omlin²

¹,² Department of Computer Engineering, Middle East Technical University, Guzelyurt, Turkish Republic of Northern Cyprus.

Abstract - In this paper, we present a fast, hybrid gradient descent and genetic algorithm for training recurrent neural networks. The hybrid algorithm uses the strengths of genetic algorithms and gradient descent learning in training recurrent neural networks (RNNs) for learning fuzzy finite automata. In the hybrid algorithm, the chromosomes are evolved using one-step gradient descent with genetic evolution. The hybrid algorithm is applied in learning deterministic finite-state automata using recurrent neural networks. The surprising results demonstrate that the hybrid algorithm trains recurrent neural networks faster when compared to training with a regular genetic algorithm alone.

Keywords: Genetic algorithms, Recurrent neural networks, Hybrid algorithm, and Gradient descent.

1 Introduction

Although neural networks have performed very well in many real-world application problems, training them can be cumbersome in cases where the data contains noise and is linearly inseparable. Traditionally, neural networks have been trained by the error back-propagation algorithm, which employs gradient descent for training. Gradient-based learning algorithms tend to get trapped in local minima, resulting in poor training and generalization performance. To overcome this shortcoming, evolutionary techniques such as genetic algorithms have been used in neural network training, which alleviate the problem of local minima. Genetic algorithms are evolutionary search techniques; thus, training of neural networks using genetic algorithms can be time-consuming. Performance comparisons of the two methods have shown that genetic algorithms generally outperform gradient descent in training feedforward neural networks for real-world application problems [1].

In recent years, training neural networks using hybrid algorithms has gained much interest. A common hybrid technique uses an evolutionary algorithm in the initial training phase; once a certain number of training generations has been reached, the evolutionary search terminates and gradient descent training is used for final training. Thus, evolutionary training is used to search the weight space globally for a promising solution and gradient descent refines that solution through local optimization. Examples of such hybrid [...] gradient descent [2, 3], and particle swarm optimization with gradient descent [4].

Combining gradient descent with an evolutionary algorithm for training neural networks allows parallel and continuous global and local search for a solution in weight space. Gradient descent has been successfully embedded in genetic algorithms for image reconstruction [5]. Particle swarm optimization (PSO) has also been combined with evolutionary algorithms (EA) for training recurrent neural networks. It has been shown that the hybrid PSO-EA outperforms standard EA and PSO algorithms [6].

The strengths of gradient descent and genetic algorithms have motivated us to develop a hybrid GA-GD learning algorithm which alleviates their respective weaknesses and exploits their respective strengths. In our hybrid algorithm, gradient descent is embedded in a genetic algorithm. Gradient descent is used to evolve the chromosomes by updating weights with the error backpropagation algorithm. The fitness of each chromosome in the population is calculated after one weight update. The fitness directly affects the selection of parent chromosomes for the crossover operator, which combines components of two parents into a single offspring. We use the roulette wheel method for probabilistically selecting parent chromosomes. Each gene in the offspring is then mutated according to the mutation probability. This hybrid algorithm differs from traditional genetic algorithms in that the evolution of weights is based on the gradient information and probabilistic mutation rather than mutation alone. Mutation offers a network the opportunity to escape from a possible local minimum in weight space.

In principle, gradient descent and genetic algorithms can be combined in two ways: the genetic evolution using crossover and mutation can be done prior to the gradient descent weight update, or the gradient descent weight update can be performed prior to genetic evolution. In fact, these two methods are equivalent: the first gradient descent weight optimization simply creates a new population; thus, we can think of that population as the initial random population of the hybrid algorithm that first performs crossover and mutation before the gradient descent optimization. Thus, there is no need to consider the two hybrid algorithms separately. In the remainder of this paper, we will only consider the hybrid algorithm which uses gradient descent weight update followed by crossover and mutation operations.

The remainder of the paper is organized as follows: In Section 2, we discuss recurrent neural networks, fuzzy finite automata, gradient descent and genetic algorithms for training RNNs. In Section 3, we discuss the framework of the hybrid GA-GD algorithm in detail. In Section 4, we show empirical results on how the hybrid GA-GD algorithm outperforms traditional GAs in training RNNs on fuzzy finite automata. We then conclude the work and discuss the feasibility of future research according to the results.

Fig. 1 First-order recurrent neural network (diagram labels: input layer, context layer with unit delay Z^-1, output layer, weights w_ij)

Recurrent neural networks have been an important focus of research as they can be applied to difficult problems involving time-varying patterns. Their applications range from speech recognition and financial prediction to gesture recognition [7]-[9]. They have the ability to provide good generalization performance on unseen data but are difficult to train. Recurrent neural networks are dynamical systems and it has been shown that they can represent deterministic finite-state automata in their internal weight representations [10].

Unlike feedforward neural networks, recurrent neural networks contain feedback connections. They are composed of an input layer, a context layer which provides state information, a hidden layer and an output layer. Each layer contains one or more neurons which propagate information from one layer to the next by computing a non-linear function of their weighted sum of inputs. Recurrent neural networks maintain information about their past states for the computation of future states and outputs. Popular architectures of recurrent neural networks include first-order recurrent networks [11], second-order recurrent networks [12], NARX networks [13] and LSTM recurrent networks [14]. A detailed study of the vast variety of recurrent neural networks is beyond the scope of this paper. Fig. 1 is a diagram of a first-order recurrent neural network showing the recurrence from the hidden to the context layer. The dynamics of the hidden state neuron activations in a first-order recurrent neural network are given by Equation 1.

$$S_i(t) = g\left( \sum_{k=1}^{K} V_{ik} S_k(t-1) + \sum_{j=1}^{J} W_{ij} I_j(t-1) \right) \tag{1}$$

where $S_k(t)$ and $I_j(t)$ represent the outputs of the state neurons and input neurons respectively, $V_{ik}$ and $W_{ij}$ represent their corresponding weights, and $g(\cdot)$ is a sigmoidal discriminant function.

2.2 Finite-state automata for RNN training

Recurrent neural networks are appropriate tools for modeling real-world application problems of speech, signature and gesture recognition, as stated earlier. However, these applications are not well suited for addressing fundamental issues of the networks such as training algorithms and knowledge representation. These applications come with specific characteristics; for example, speech recognition may require feature extraction, which may hinder the investigation of the networks' fundamental issues. Different applications require different feature extraction techniques. Models such as finite-state automata and their corresponding languages can be viewed as a general paradigm of temporal, symbolic language. No feature extraction is necessary for recurrent neural networks to learn these languages. The knowledge acquired in recurrent neural networks through learning corresponds well with the dynamics of finite-state automata. The ability to represent an automaton is a prerequisite for learning its corresponding language; i.e. if the architecture cannot represent a particular automaton then it would not be able to learn it either.

Finite automata have been used for training recurrent neural networks as they represent dynamical systems. They have also been used to study knowledge representation in recurrent neural networks and it has been demonstrated through knowledge extraction that RNNs can represent dynamical systems [15]-[17]. Finite-state automata are used as test beds for training recurrent neural networks. They have been popular as they represent dynamical systems and the strings for training do not need to undergo any feature extraction.

An alphabet Σ is a finite set of symbols. A formal language is a set of strings of symbols over some alphabet. Simple alphabets, e.g. Σ = {0, 1}, are typically considered in the study of formal languages since results can easily be extended to larger alphabets. The set of all strings of odd parity L = {1, 01, 10, 001, 010, 100, 111, ...} is an example of a simple language. The symbol ε is used to denote the null string, which has even parity and is therefore not in L. The language contains an infinite number of strings.
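As a minimal sketch of Equation 1 applied to strings over {0, 1}, the following Python code runs a first-order state update over a binary string and computes the odd-parity label that a training set for the example language would use. The one-hot input encoding, the zero initial state, the logistic choice for g, the weight values, and the names `sigmoid`, `step`, `run` and `odd_parity` are assumptions made for illustration; the paper does not specify them.

```python
import math

def sigmoid(x):
    """Logistic function, one common choice for the sigmoidal g(.)."""
    return 1.0 / (1.0 + math.exp(-x))

def step(V, W, state, inp):
    """One application of Equation 1:
    S_i(t) = g(sum_k V[i][k] * S_k(t-1) + sum_j W[i][j] * I_j(t-1))."""
    return [
        sigmoid(sum(V[i][k] * state[k] for k in range(len(state)))
                + sum(W[i][j] * inp[j] for j in range(len(inp))))
        for i in range(len(V))
    ]

def run(V, W, string):
    """Feed a binary string symbol by symbol and return the final state vector."""
    state = [0.0] * len(V)  # assumed initial state
    for symbol in string:
        # assumed one-hot encoding of the symbols '0' and '1'
        inp = [1.0, 0.0] if symbol == "0" else [0.0, 1.0]
        state = step(V, W, state, inp)
    return state

def odd_parity(string):
    """Target label: 1 if the string contains an odd number of 1s, else 0."""
    return string.count("1") % 2

# Hypothetical weights for K = 2 state neurons and J = 2 input neurons.
V = [[0.5, -0.3], [0.1, 0.8]]
W = [[1.0, -1.0], [-0.5, 0.5]]
final_state = run(V, W, "0100101")
label = odd_parity("0100101")
```

Because each state depends only on the previous state and the current symbol, the loop in `run` is the whole recurrence; a trained network would additionally map the final state to an accept/reject output.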

Fig. 2 A 7-state deterministic finite-state automaton.

A deterministic finite-state automaton (DFA) is defined as a 5-tuple M = (Q, Σ, δ, q_1, F), where Q is a finite set of states, Σ is the input alphabet, δ : Q × Σ → Q is the next-state function which defines the state q' = δ(q, σ) reached by the automaton after reading symbol σ when in state q, q_1 ∈ Q is the initial state of the automaton (before reading any string) and F ⊆ Q is the set of accepting states of the automaton. The language L(M) accepted by the automaton contains all the strings that bring the automaton to an accepting state. The languages accepted by DFAs are called regular languages. Fig. 2 shows the DFA which will be used for training RNNs using the hybrid evolutionary one-step algorithm. Double circles in the figure show accepting states while rejecting states are shown by single circles. State 1 is the automaton's start state. The training and testing sets are obtained by presenting strings to this automaton, which gives an output, i.e. rejecting or accepting, depending on the state reached after the last symbol of the string has been read. For example, the string 0100101 of length 7 ends in state 5, which is an accepting state; therefore the output is 1.

2.3 Gradient descent algorithm for training neural networks

Error backpropagation employs gradient descent learning and is the most popular algorithm used for training neural networks. The goal of gradient descent learning is to minimize the sum of squared errors by propagating error signals backward through the network architecture upon the presentation of training samples from the training set. These error signals are used to calculate the weight updates which represent the knowledge learnt from training. A limitation of gradient descent learning is its tendency to get trapped in a local minimum during training, resulting in poor training and generalization performance.

Backpropagation is used for training feedforward networks while backpropagation-through-time (BPTT) is employed for training recurrent neural networks. BPTT is the spatio-temporal extension of the backpropagation algorithm [18]. The general idea behind BPTT is to unfold the recurrent neural network in time so that it becomes a deep multilayer feedforward network. This can be done by adding a layer for each time step. When unfolded in time, the network has the same behavior as a recurrent neural network for a finite number of time steps. Gradient descent has difficulty learning long-term dependencies in recurrent neural networks as the error gradient decreases significantly over longer sequences [19]. The weight update in gradient descent learning is computed by adding $\Delta w_{ji}$ to the respective weight as shown in Equation 2:

$$\Delta w_{ji} = -\alpha \frac{\partial E_d}{\partial w_{ji}} \tag{2}$$

where $\alpha$ is the learning rate and $E_d$ is the error on training example d, summed over all output units in the network as shown in Equation 3.

$$E_d = \frac{1}{2} \sum_{j=1}^{m} \left( d_j - S_j^L \right)^2 \tag{3}$$

Here $d_j$ is the desired output for neuron j in the output layer, which contains m neurons, and $S_j^L$ is the network output of neuron j in the output layer L. Fig. 4 shows a high-level framework of BPTT, which employs gradient descent for error back-propagation and weight update.

__________________________

procedure: Gradient Descent for RNN training

   initialize weights and biases
   while (termination condition is not satisfied) do
      i) forward propagate
      ii) back-propagate error through time and do weight update
   end
   load data for testing the RNN
______________________________

Fig. 4 Description of BPTT for Weight Update

2.4 Genetic algorithms for training neural networks

Genetic algorithms provide a learning method motivated by biological evolution [20]. They have been successfully applied to neural network weight updates and to network topology optimization [21]. In recent years, such hybrid approaches to neural network training have gained popularity and have been applied to real-world problems such as job scheduling [22], forecasting [23] and robotics control [24].

The general idea in using genetic algorithms for training neural networks is to encode weights as chromosomes in a population. The task of genetic algorithms then is to find optimal sets of weights that best represent the knowledge after being presented with the training data in the network. The fitness function is thus the sum of squared errors returned by the network after being presented with the weights encoded in chromosomes. Genetic algorithms find the optimal set of weights in a network topology which minimizes the error function. To evaluate the fitness function, each weight encoded in the chromosome is assigned to the respective weight link of the network. The training set of examples is then presented to the network, which propagates the input signals forward, and the sum of squared errors is calculated. In this way, genetic algorithms attempt to find a set of weights which minimizes the error function of the network. Unlike learning with gradient descent, genetic algorithms can help neural networks to escape from local minima in weight space. Fig. 5 shows a high-level description of genetic algorithms for training RNNs.
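The fitness evaluation described in this section can be sketched as follows: a flat real-valued chromosome is copied into the weight matrices, the network is run on each training string, and the sum of squared errors (Equation 3, with a single output neuron) is accumulated. The chromosome layout, the output weights `U`, the one-hot input encoding, the tiny sample set and all function names are assumptions for illustration; the paper does not specify them.

```python
import math

def sigmoid(x):
    """Logistic function used as the sigmoidal activation."""
    return 1.0 / (1.0 + math.exp(-x))

def decode(chromosome, n_hidden, n_input):
    """Split a flat chromosome into V (hidden x hidden), W (hidden x input)
    and U (one output weight per hidden neuron). This layout is an assumption."""
    genes = iter(chromosome)
    V = [[next(genes) for _ in range(n_hidden)] for _ in range(n_hidden)]
    W = [[next(genes) for _ in range(n_input)] for _ in range(n_hidden)]
    U = [next(genes) for _ in range(n_hidden)]
    return V, W, U

def predict(chromosome, string, n_hidden=2, n_input=2):
    """Assign the genes to the weight links and propagate a string forward."""
    V, W, U = decode(chromosome, n_hidden, n_input)
    state = [0.0] * n_hidden
    for symbol in string:
        inp = [1.0, 0.0] if symbol == "0" else [0.0, 1.0]
        state = [
            sigmoid(sum(V[i][k] * state[k] for k in range(n_hidden))
                    + sum(W[i][j] * inp[j] for j in range(n_input)))
            for i in range(n_hidden)
        ]
    return sigmoid(sum(U[i] * state[i] for i in range(n_hidden)))

def sum_squared_error(chromosome, samples):
    """Error of Equation 3 (m = 1 output neuron), summed over all samples."""
    return sum(0.5 * (target - predict(chromosome, s)) ** 2
               for s, target in samples)

# Hypothetical training samples labelled by odd parity.
samples = [("1", 1), ("01", 1), ("11", 0), ("0", 0)]
chromosome = [0.1] * (2 * 2 + 2 * 2 + 2)
error = sum_squared_error(chromosome, samples)
```

A genetic algorithm would call `sum_squared_error` once per chromosome per generation and prefer chromosomes with a lower value.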

____________________________

   initialize population
   evaluate RNN's fitness
   while (termination condition is not satisfied) do
      i) crossover and mutation
      ii) update population
      iii) evaluate RNN's fitness
   end
   get the optimal chromosome of weights
   load data for testing the RNN
________________________________

Fig. 5 Genetic algorithms for evolution of RNN weights

The weights are encoded into chromosomes using either binary or real-numbered weight encoding schemes. In binary encoding, a set of genes corresponds to a certain weight link [25, 26]. The genes are changed into real weight values before being decoded into their respective weight links in order to evaluate the fitness function. Real number encodings are an alternative approach [27]. In order to use this method, genetic operators must be changed, as traditional genetic operators are specifically designed for binary chromosomes. One way of altering the genetic operators is as follows: the crossover operator takes two parent chromosomes and creates a single child chromosome by randomly selecting corresponding genetic material from both parents as shown in Fig. 6. The mutation operator adds a small random real number in the range [-1, 1] to each gene in the offspring according to the mutation probability.

Fig. 6 The crossover operator in genetic algorithm

3 Hybrid Training Algorithm

The strengths and weaknesses of gradient descent and genetic algorithms have been discussed in the previous sections. While genetic algorithms have been shown to overcome the problem of local minima, their drawback is that evolutionary optimization can be time-consuming. The evolution of weights can also temporarily direct the network away from the optimal solution.

The update of weights and biases in order to minimize the fitness function is common to both gradient descent and genetic algorithms. In the hybrid approach, the gradient information is used in creating a new population on which genetic operators such as crossover and mutation are applied. Genetic operators combine components from two different solutions according to the respective selection criterion and therefore, a better solution can be obtained. The description of the proposed algorithm is shown in Fig. 7.

In the description of the algorithm in Fig. 7, a population of hypotheses, each representing the network weights as a chromosome, is initialized with small random real values. Each chromosome is presented to the network, where it is updated using gradient descent, which calculates the weight update according to the error from the input to output mappings of the training samples. This weight update is done for one epoch only. Each updated network then becomes part of the new population. Each chromosome in the new population is evaluated according to the fitness function, which is the reciprocal of the objective function (e.g. the sum of squared error returned by the network). If the termination condition is not satisfied, then the algorithm proceeds with genetic evolution using crossover and mutation: (1) two parents are chosen by the respective selection criterion such as rank, roulette wheel or tournament selection, (2) an offspring is created from the components of each parent using the crossover operator according to the crossover probability.

Then each gene in the chromosome is altered by adding a small random number according to the mutation probability.

__________________________

procedure: Hybrid Training Algorithm for RNN

   initialize population
   evaluate RNN's fitness
   while (termination condition is not reached) do
      i) crossover and mutation (genetic evolution)
      ii) present each chromosome to RNN using GD weight update for 1 epoch only (neural weight update)
      iii) update population
      iv) evaluate RNN's fitness
   end
   get the optimal chromosome of weights
   load data for testing the RNN
________________________________

Fig. 7 Description of the proposed hybrid genetic algorithm/gradient descent training algorithm

4 Results and Discussion

In the following results, we show the performance of training of recurrent neural networks with the genetic algorithm alone and our hybrid training method, respectively. In both cases, we randomly initialise all the genes in the chromosomes in the range of [-1, 1]. From trial experiments, we determined a population size of 40 to give the best results; therefore, this population size is used in all the experiments. We used different combinations of crossover and mutation probabilities of 0.9 and 0.5. For each combination of crossover and mutation probabilities, we ran 10 experiments.

In the implementation of the genetic algorithm, which evolves real-numbered weight values from the network, the optimal probabilities of crossover and mutation are important for rapid convergence to a solution. To understand the genetic training process for neural networks, one has to consider that the actual learning takes place during mutation, where there is a significant change in the weight values. The crossover operator does not alter the value of the weights in any way; it only exchanges them with those of the respective selected parent. When using real-valued genetic weight representation, mutation is thus more significant for the learning. Therefore, we ran experiments to find out the optimal probabilities for crossover and mutation. We used the following hybrid weight update strategy: we construct an RNN from a chromosome, perform weight update for one epoch only, and then apply probabilistic crossover and mutation.

Note that we used the sum of squared error from the network as the fitness function. The crossover operator chooses two parents using roulette wheel selection and creates a child chromosome by probabilistically selecting genes from each parent. The mutation operator adds a small real random number in the range of [-1, 1] to each gene in the chromosome. The maximum training time allowed was 1000 generations. We used 8 neurons in the hidden layer as it showed successful results for representing a 7-state DFA in trial experiments. The results for training RNNs using the hybrid algorithm are shown in Table 1. Table 2 shows the results for training using a standard GA.

The results clearly demonstrate that our hybrid algorithm outperformed training of RNNs with a genetic algorithm alone in terms of training time. The training time varies with the combination of the crossover and mutation probabilities. The results are promising, which motivates the application of the hybrid training algorithm in training feedforward neural networks. The contribution of our hybrid training algorithm to solving real-world problems looks promising.

TABLE 1: Hybrid Training Algorithm for RNN

Mutation   Crossover   Training Accuracy   Generalization Accuracy   Training time
0.5        0.9         100±0%              100±0%                    2.7±1.2
0.9        0.9         100±0%              100±0%                    4.2±2.0
0.5        0.5         100±0%              100±0%                    3.3±0.9
0.9        0.5         100±0%              100±0%                    3.5±0.7

The 90 percent confidence interval for 10 experiments done with different values of crossover and mutation is given. The training time is given by the number of 'generations'. The maximum training time allowed was 1000 generations.

TABLE 2: GA Training for RNN

Mutation   Crossover   Training Accuracy   Generalization Accuracy   Training time
0.5        0.9         100±0%              100±0%                    118.9±44.1
0.9        0.9         100±0%              100±0%                    63.2±23.4
0.5        0.5         100±0%              100±0%                    86.4±21.2
0.9        0.5         100±0%              100±0%                    77.9±15.6

The 90 percent confidence interval for 10 experiments done with different values of crossover and mutation is given. The training time is given by the number of 'generations'. The maximum training time allowed was 1000 generations.
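The hybrid weight update strategy used in these experiments (one gradient descent update per chromosome, then probabilistic crossover and mutation) can be sketched in a few lines if the network is abstracted behind two callables. In the sketch below, `loss` and `grad` are hypothetical stand-ins for the RNN's sum of squared errors and its BPTT gradient, and a quadratic toy objective replaces a real network; the learning rate, the chromosome length, the fitness form 1/(loss + 1e-12) and the per-gene 50/50 crossover are likewise assumptions, not the authors' exact settings.

```python
import random

def gd_step(w, grad, lr):
    """One gradient descent update (Equation 2) applied to every gene."""
    return [wi - lr * gi for wi, gi in zip(w, grad(w))]

def roulette_select(population, fitness, rng):
    """Pick one parent with probability proportional to its fitness."""
    return rng.choices(population, weights=fitness, k=1)[0]

def crossover(a, b, p_cross, rng):
    """With probability p_cross mix genes from both parents, else copy parent a."""
    if rng.random() >= p_cross:
        return list(a)
    return [x if rng.random() < 0.5 else y for x, y in zip(a, b)]

def mutate(w, p_mut, rng):
    """Add a random number from [-1, 1] to each gene with probability p_mut."""
    return [g + rng.uniform(-1.0, 1.0) if rng.random() < p_mut else g for g in w]

def hybrid_generation(population, loss, grad, lr, p_cross, p_mut, rng):
    """GD update of every chromosome, then selection, crossover and mutation."""
    updated = [gd_step(w, grad, lr) for w in population]
    # Fitness as the reciprocal of the error; the small constant avoids division by zero.
    fitness = [1.0 / (loss(w) + 1e-12) for w in updated]
    children = []
    for _ in range(len(updated)):
        a = roulette_select(updated, fitness, rng)
        b = roulette_select(updated, fitness, rng)
        children.append(mutate(crossover(a, b, p_cross, rng), p_mut, rng))
    return children

# Toy stand-in for the network error: loss(w) = sum of w_i^2, gradient 2 * w.
def loss(w):
    return sum(x * x for x in w)

def grad(w):
    return [2.0 * x for x in w]

rng = random.Random(0)
# Population of 40 chromosomes with genes initialised in [-1, 1], as in the experiments.
population = [[rng.uniform(-1.0, 1.0) for _ in range(5)] for _ in range(40)]
for _ in range(10):
    population = hybrid_generation(population, loss, grad, 0.1, 0.9, 0.5, rng)
```

Replacing `loss` and `grad` with a network's error and its BPTT gradient would turn this into a training loop; dropping the `gd_step` line gives the plain genetic algorithm used as the baseline.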

5 Conclusion

We have presented a simple hybrid algorithm for training recurrent neural networks using a combination of gradient descent and genetic algorithm weight update. Each chromosome in the population is modified using one step of gradient descent optimization followed by the application of standard crossover and mutation operators. Surprisingly, we have found that this single gradient descent step makes the difference between rapid convergence and non-convergence within 1000 generations when applied to the problem of training recurrent neural networks to behave like deterministic finite-state automata. It would be interesting to see the performance of the hybrid training algorithm on fuzzy finite-state automata in future work. The contribution of our hybrid training algorithm to solving real-world problems looks promising.

6 References

[1] Randall S. Sexton, Robert E. Dorsey, "Reliable classification using neural networks: a genetic algorithm and backpropagation comparison", Decision Support Systems, 30, 2000, 11-22.

[2] Rohitash Chandra, Christian W. Omlin, "Combining Genetic and Gradient Descent Learning in Recurrent Neural Networks: An Application to Speech Phoneme Classification", Proceedings of the International Conference on Artificial Intelligence and Pattern Recognition, Orlando FL, USA, July 2007, pp. 278-285.

[3] Lixin Lu and Yan-Qing Zhang, "Evolutionary Fuzzy neural networks for Hybrid Financial Prediction", IEEE Transactions of systems, man and cybernetics-Part C: Applications and Reviews, Vol 35, No. 2, May 2005, pp. 244-249.

[4] Jing-Ru Zang, Jun Zhang, Tat-Ming Lok, Michael R. Lyu, "A hybrid particle swarm optimization-backpropagation algorithm for feedforward neural network training", Applied Mathematics and Computation, 185, 2007, 1026-1037.

[5] Liu Mei, Liu WeiDong, Sun DeQing, Chen Guqiao and Liu Huinian, "A new super-resolution image reconstruction method based on hybrid genetic algorithm", Proceedings of the 2004 IEEE International Conference on Control Applications, Taipei, Taiwan, 2004, pp. 211-216.

[6] Xindi Cai, Nian Zhang, Ganesh K. Venayagamoorthy, Donald C. Wunsch II, "Time series prediction with recurrent neural networks trained by a hybrid PSO-EA algorithm", Neurocomputing, 70, 2007, 2342-2353.

[7] A.J Robinson, "An application of recurrent nets to phone probability estimation", IEEE transactions on Neural Networks, vol. 5, no. 2, 1994, pp. 298-305.

[8] C.L. Giles, S. Lawrence and A.C. Tsoi, "Rule inference for financial prediction using recurrent neural networks", Proc. of the IEEE/IAFE Computational Intelligence for Financial Engineering, New York City, USA, 1997, pp. 253-259.

[9] K. Marakami and H Taguchi, "Gesture recognition using recurrent neural networks", Proc. of the SIGCHI conference on Human factors in computing systems: Reaching through technology, Louisiana, USA, 1991, pp. 237-242.

[10] C. Lee Giles, C.W Omlin and K. Thornber, "Equivalence in Knowledge Representation: Automata, Recurrent Neural Networks, and dynamical Systems", Proc. of the IEEE, vol. 87, no. 9, 1999, pp. 1623-1640.

[11] P. Manolios and R. Fanelli, "First order recurrent neural networks and deterministic finite state automata", Neural Computation, vol. 6, no. 6, 1994, pp. 1154-1172.

[12] R. L. Watrous and G. M. Kuhn, "Induction of finite-state languages using second-order recurrent networks", Proc. of Advances in Neural Information Systems, California, USA, 1992, pp. 309-316.

[13] T. Lin, B.G. Horne, P. Tino, & C.L. Giles, "Learning long-term dependencies in NARX recurrent neural networks", IEEE Transactions on Neural Networks, vol. 7, no. 6, 1996, pp. 1329-1338.

[14] S. Hochreiter and J. Schmidhuber, "Long short-term memory", Neural Computation, vol. 9, no. 8, 1997, pp. 1735-1780.

[15] C. W. Omlin and C. Giles, "Constructing deterministic finite-state automata in recurrent neural networks", Journal of the ACM, vol. 43, no. 6, 1996, pp. 937-972.

[16] C. W. Omlin, K. K. Thornber, & C. L. Giles, "Fuzzy finite state automata can be deterministically encoded into recurrent neural networks", IEEE Trans. Fuzzy Syst., 6, 1998, 76-89.

[17] R. L. Watrous and G. M. Kuhn, "Induction of finite-state languages using second-order recurrent networks", Proc. of Advances in Neural Information Systems, California, USA, 1992, pp. 309-316.

[18] P. J. Werbos, "Backpropagation through time: what it does and how to do it", Proc. of the IEEE, vol. 78, no. 10, 1990, pp. 1550-1560.

[19] Y. Bengio, P. Simard and P. Frasconi, "Learning long-term dependencies with gradient descent is difficult", IEEE Transactions on Neural Networks, vol. 5, no. 2, 1994, pp. 157-166.

[20] J. H. Holland, "Genetic Algorithms and the Optimal Allocation of Trials", SIAM Journal of Computing, vol. 2, no. 2, 1973, pp. 88-105.

[21] J. H. Ang, K. C. Tan, A. Al-Mamun, "Training neural networks for classification using growth probability-based evolution", Neurocomputing, 2008, doi:10.1016/j.neucom.2007.10.011

[22] Haibin Yu, Wei Liang, "Neural Network and genetic algorithm based hybrid approach to extended job-scheduling", Computers and Industrial Engineering, 39, 2001, 337-356.

[23] Harri Niska, Teri Hiltunen, Ari Karppinen, Juhani Ruuskanen, Mikko Kolehmainen, "Evolving the neural network model for forecasting air pollution time series", Engineering Applications of Artificial Intelligence, 17, 2004, 159-167.

[24] Genci Capi, Kenji Doya, "Evolution of recurrent neural controllers using an extended parallel genetic algorithm", Robotic and Autonomous Systems, 52, 2005, 148-159.

[25] P. J. Angeline, G. M. Sauders, and J. B. Pollack, "An evolutionary algorithm that constructs recurrent neural networks", IEEE Transactions on Neural Networks, vol. 5, 1994, pp. 54-65.

[26] M. A. Potter and D. Jong, "Evolving neural networks with collaborative species", Proc. of the Summer Computer Simulation Conference, 1995.

[27] M. Negnevitsky, Artificial Intelligence: A Guide to Intelligence Systems, Addison Wesley, 2004.
