System *
Abstract- This paper demonstrates how a heuristic neural control approach can be used to solve a complex nonlinear control problem. In addition to swinging up the pendulum, the controller is also required to bring the cart back to the origin of the track. Through the solution of this specific control problem, we try to illustrate a heuristic neural control approach with task decomposition, control rule extraction and neural net rule implementation as its basic elements. Specializing to the pendulum problem, the global control task is decomposed into sub-tasks, namely, pendulum positioning and cart positioning. Accordingly, three separate neural sub-controllers are designed to cater to the sub-tasks and their coordination. The simulation result is provided to show the actual performance of the controller.

I. INTRODUCTION

Many industrial control problems can simply be phrased as choosing the right inputs to bring the state of a plant to some desired position in an appropriate state space. A typical example is stabilization, in which the desired control action is to bring the state of a plant to zero by designing appropriate control inputs. Another example is the shift of set-point. In this case, control inputs are sought in order to bring the state of a plant from a previous set-point to the intended one. Control problems such as these can be extremely involved even for relatively simple nonlinear systems, and no general methodology is available for actually carrying out the design.

There exist, however, certain classes of nonlinear control problems for which the allowable control inputs are restricted to a few fixed functional types. An extreme case is that the control can only assume two different constant values. Control problems of this kind can usually be translated into finite decision problems that can then be solved using techniques such as simple neural nets [1].

The idea of translating a control problem into a decision problem can also be extended beyond the classes mentioned above. In fact, if the state-space of a plant is partitioned into a finite number of regions, a continuous control action defined over the state-space can, under some circumstances, be approximated by a constant in each of the regions. If we further confine the input to have only a finite number of levels (as in the case of finite control horizon [5]), then the control problem can readily be treated as a finite decision problem. Observe that this idea is also the principle behind the so-called fuzzy control approach [7], though the state-space partition in fuzzy control does not generate regions with crisp boundaries.

An obvious advantage of this decision approach to control is that the controller can be designed in a model-free fashion. This is in contrast with the traditional control design methodology, where a model of the plant is essential to the control design. Originated by Widrow [2], the decision approach points to an interesting direction in handling classes of difficult nonlinear control problems. Other pioneering works along the same line can be found in [3][4], though these works typically follow a stochastic rather than deterministic formulation.

The translated decision problems are, after all, problems of pattern recognition, which can be approached with many different conventional techniques. Notable for its relative versatility among these techniques is the so-called neural net approach. Neural nets come in many varieties, and a widely appreciated family of neural nets is the layered feedforward net with bipolar neurons.¹ This structure is called Madaline by Widrow [1],

* This research work was partially carried out at the ESAT laboratory of the Katholieke Universiteit Leuven, in the framework of a Concerted Action Project of the Flemish Community, entitled Applicable Neural Networks. The scientific responsibility is assumed by its authors.

¹ Bipolar neurons are the neurons with the output of either +1 or -1. A neural net with bipolar neurons will simply be referred to
Figure 1: The scheme of the rule based controller design

Figure 2: The cart-pendulum system; the angle and angular speed are positive in the clockwise direction.
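The decision view from the Introduction, in which a control that can take only two constant values is selected according to the region of state space the plant occupies, can be illustrated with a single bipolar neuron. The following is a minimal sketch, not the authors' network: the function names, the weights and the tie-breaking convention at zero activation are all assumptions made for illustration.

```python
def bipolar_neuron(weights, bias, inputs):
    """Return +1 or -1 according to the sign of the weighted sum.

    Assumption: a zero activation is mapped to -1; the text only says
    that the output is either +1 or -1.
    """
    activation = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if activation > 0 else -1


def two_valued_control(force, weights, bias, state):
    """Pick +force or -force depending on which side of a hyperplane
    the state lies (a two-region partition of the state space)."""
    return force * bipolar_neuron(weights, bias, state)


# Hypothetical rule on a two-dimensional state (position, speed):
# the weights below are placeholders, not values from the paper.
u = two_valued_control(10.0, [1.0, 0.5], 0.0, [0.2, -0.1])
```

Finer partitions of the state space correspond to combining several such neurons in layers, which is the structure the text refers to as a layered feedforward net.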
control problem is linear in nature, thus does not really serve to illustrate the strength of the neural controllers that are supposed to be apt at handling hard nonlinear control problems. In fact, the balancing problem can be solved more elegantly by using techniques developed in linear control theory.

To come up with manageable rules, the global control task is decomposed into the sub-tasks of pendulum positioning and cart positioning. Accordingly, three separate neural sub-controllers are designed to cater to the sub-tasks and their coordination. These are the pendulum sub-controller (PSC), the cart sub-controller (CSC) and the switching sub-controller (SSC). Each of the sub-controllers is designed based on the rules and guidelines obtained from the experiences of a human operator. Simulation analysis is also carried out to show the performance of the neural controller.

II. THE PLANT DYNAMICS AND PROBLEM DECOMPOSITION

Though the mathematical model of the inverted pendulum system is not relevant to the design of our controller, we nonetheless write it down here to show the nonlinearity inherent in the system, and to set the notation. With appropriate setting of the parameters, these equations are also used to generate the next state based on the current input and state of the system, which is the basis of our simulation and design.

The inverted pendulum system shown in Figure 2 consists of a cart of mass $m_c$ and a pendulum of mass $m$, centered at the half length $l$. The constant force $F$ is applied in both horizontal directions through the mass center of the cart so as to move it up and down the track. The track is delimited at both ends, as shown in Figure 2, by $\pm x_{\max}$. According to physics, the system is modeled by the following set of differential equations

[Equation (1) is missing in the source.]

$$\ddot{x} = \frac{F + m l\,[\dot{\theta}^2 \sin\theta - \ddot{\theta}\cos\theta]}{m_c + m} \qquad (2)$$

where $\theta$ and $x$ are, respectively, the angular and the cart displacements; $g = 9.8\ \mathrm{m/s^2}$ is the acceleration due to gravity. To be specific, the parameters are chosen as $F = 10$ N, $m_c = 1.0$ kg, $m = 0.1$ kg, and $l = 0.5$ m for the purpose of controller design. Note that the friction between the cart and the track, and that at the hinge between the pendulum and the cart, are both neglected in the above equations to make them look less formidable.

Obviously, even starting with the known Equations (1) and (2), the control does not seem to follow from any existing linear techniques, as the control problem itself will not allow the linearization of the mathematical model. Moreover, there seems to be no nonlinear control technique that solves the problem in a systematic manner. It is obvious that a complex nonlinear model, even to the extreme detail, may in a lot of cases still be unable to help in determining the appropriate control.² This is where a model-free control approach (heuristic approach) in a sub-class is preferred.

² In the present paper, the model refers, specifically, to the mathematical equations of the system. Thus, a set of heuristic rules describing the behaviour of a system is not considered as the system model.

To motivate our model-free heuristic design approach, let us first analyze how a human operator would try to accomplish the desired control objective. Naturally, the first step towards the control involves a trial-and-error process of getting a feel of the system. This is a process of rule extraction. The extracted rules are situation-action in nature and completely different from the pure mathematical modelling of the dynamics of the system.

After getting a feel of the system, the next step the human operator is likely to take is to identify that the pendulum swing-up operation and the cart positioning can actually be viewed as two separable sub-control objectives. To meet the final control demand, he only has to first swing up the pendulum regardless of the cart disposition (within the pre-set limits), then reposition the cart back to the origin of the track while balancing the pendulum. The latter two sub-problems are apparently considerably more manageable than the original one.

In fact, by playing with a computer emulation program of the inverted pendulum system, the authors have gone through exactly the same steps described above to reach the control objective. It follows from our own experiences that trying to simultaneously take care of swinging-up and the repositioning proves to be extremely difficult. This can perhaps be interpreted in terms of trajectory flow in the state-space composed of the states $\theta, \dot{\theta}, x, \dot{x}$. The feasible trajectory may first wander far away from the origin before it eventually comes back. Though the decomposability of a complex problem seems generically possible, the actual process of decomposition is not at all as simple as shown in this example. It may involve analysis, experiences through trial and error, human intuition and generalization.

The actual process of decomposition and rule extraction for a particular system is influenced by the nature of the underlying plant as well as the control requirements. For instance, the decomposition and rules will be different if an additional requirement of time-optimality is imposed.

Although the task of decomposition is often explicitly realized and attempted by a human operator, the aspects of rules he comes up with to deal with each sub-problem and the coordination of the rules for different sub-problems are less
explicit. For example, while playing with the emulation program, we cannot substantiate the instant decisions that eventually contribute to a success. This tacitness of the rules is often the most difficult aspect of all rule-based approaches. The degree of difficulty in obtaining rules varies for different control problems. In the following section, we shall work out the appropriate rules for the inverted pendulum system to hint at the general principles we normally follow.

III. RULE EXTRACTION AND NEURAL NET IMPLEMENTATION

We now discuss how the control rules are extracted and implemented by simple feedforward and feedback neural nets.

As discussed before, the two sub-problems, namely, swinging up the pendulum and repositioning the cart to the origin, will be dealt with separately using two sub-controllers, the Pendulum Sub-Controller (PSC) and the Cart Sub-Controller (CSC). The actions of the two sub-controllers are coordinated by a third sub-controller called the Switching Sub-Controller (SSC). The overall decomposed control scheme is shown in Figure 3. The rule extraction processes and neural net rule implementation for all three sub-controllers are detailed in the subsequent sub-sections.

Figure 3: The overall control scheme

A. Pendulum Sub-Controller

The sub-controller for swinging up and stabilizing the pendulum is viewed as consisting of three functional parts: pendulum swinging-up, stabilization, and switching from the swinging-up action to stabilization.

The switching condition between the two control actions is a prescribed small angular displacement $\alpha$ ($\alpha > 0$), i.e., whenever $-\alpha < \theta < \alpha$, the switching will take place.

Experience shows that it is impossible to swing up the pendulum to the range of $-\alpha < \theta < \alpha$ by applying a constant force along a single direction (it may become possible if the magnitude of the force is extremely large). We will have to switch the direction of the force at a certain point (say, $\dot{\theta} = \beta$) of the state space. If the specified magnitude of $F$ is too small, more than one such force switching is needed. For our case ($F = 10$), one switching is enough if $\beta$ is suitably chosen.

The behavior of the system under such a control input sequence is as follows. When $+F$ is applied, the pendulum (starting from the initial position $\theta = \pi$, $\dot{\theta} = 0$) swings up to a larger $\theta$ on the side of $\theta > \pi$ with $\dot{\theta} > 0$, until $\dot{\theta}$ decreases to zero (at which point $\theta - 2\pi < -\alpha$). Then the pendulum begins to fall down with a negative $\dot{\theta}$. When $\dot{\theta}$ becomes equal to $\beta$, the force changes to $-F$, resulting in the pendulum falling down with an increasingly negative $\dot{\theta}$. When the pendulum swings back to the point $\theta = \pi$, there will still be a negative $\dot{\theta}$ (note that without changing the direction of $F$ from $+F$ to $-F$ at $\dot{\theta} = \beta$, $\dot{\theta}$ will be zero at this point). By continuously applying $-F$, the pendulum will swing up to the point where $\theta < \alpha$ on the side of $\theta < \pi$.

Following the preceding analysis, it can be readily seen that the choices of $\alpha$ and $\beta$ are vital to the success of this sub-controller. Improper choices of $\alpha$ and $\beta$ will result in either being unable to swing the pendulum up with a single change of force or oscillation between the swing-up and the stabilization actions. Several trial-and-error processes indicate that one proper choice is $\alpha = 0.2$ and $\beta = -1$.

The design of the stabilizing part of the PSC is relatively simple. Based on our intuition, it is easy to see that we should apply $+F$ when $\theta > 0$ and $\dot{\theta} > 0$, apply $-F$ when $\theta < 0$ and $\dot{\theta} < 0$, and finally, apply $+F$ or $-F$ when $\theta \cdot \dot{\theta} < 0$ depending on the ratio $\kappa = \dot{\theta}/\theta$ ($\kappa$ will be chosen to be $-15$).

To summarize, the rules for the design of PSC are given below:

Rules for PSC:

• $+F$ when $\dot{\theta} > \beta$, otherwise $-F$ (swinging up);
• $+F$ when $\kappa\theta + \dot{\theta} > 0$, otherwise $-F$ (stabilization);
• change from swing-up to stabilization when $-\alpha < \theta < \alpha$.

From the geometric theory of neural networks [11], we know that these rules can easily be implemented in a two-layer multilayer perceptron neural network, which is shown in the upper part of Figure 4.

B. Cart Sub-Controller

The position control of the cart is quite simple. The direction of the applied force is determined by $x$ and $\dot{x}$, the
position and the speed of the cart. When $x > 0$ and $\dot{x} > 0$, $-F$ should be applied; when $x < 0$ and $\dot{x} < 0$, $+F$ should be applied; when $x \cdot \dot{x} < 0$, the choice of $+F$ or $-F$ depends on the ratio $\rho = \dot{x}/x$. Based on experience, $\rho$ has been chosen to be $-0.5$. So the rules for CSC are the following.

Rule for CSC:

• $+F$ when $\rho x + \dot{x} < 0$, otherwise $-F$.

The neural network implementation of this sub-controller is shown in the lower part of Figure 4.

Figure 4: The whole neural controller with SSC highlighted.

C. Switching Sub-Controller

Although the two sub-controllers (PSC and CSC) designed above are able to carry out separate pendulum swing-up and cart positioning control actions, they need to be combined dynamically by the switching sub-controller (SSC) to finally realize the global control objective. The idea of switching between the two sub-controllers is based on the following simple fact: when the pendulum is within a very small range of the zero state ($-\mu < \theta < \mu$, $-\nu < \dot{\theta} < \nu$, where $\mu$, $\nu$ are small positive values), it will need more reverse forces to bring the pendulum back to this range than to push it away. This phenomenon can be roughly explained by the torque applied on the pendulum, i.e., the smaller $|\theta|$ is, the larger this torque will be when $F$ remains the same.

Based on the above phenomenon, we will design the SSC to work in the following way: whenever $-\mu < \theta < \mu$ and $-\nu < \dot{\theta} < \nu$, switch the control to the CSC, which will generate a sequence of identical forces ($-F$ or $+F$ depending on the state of the cart); the actions of the CSC will cause the pendulum to leave the small region around the zero state until $\theta > \gamma$ or $\theta < -\gamma$ ($\gamma$ is a small angle but greater than $\mu$), at which point the SSC will switch the control to the PSC, which will bring the pendulum back to the zero state and also move the cart toward the origin. This switching process will continue until the cart reaches the origin of the track, after which both the pendulum and the cart will swing back and forth within a small range of the zero state. Let us summarize the rules for SSC as follows.

Rules for SSC:

• switch to CSC when $-\mu < \theta < \mu$ and $-\nu < \dot{\theta} < \nu$;
• switch back to PSC when $\theta < -\gamma$ or $\theta > \gamma$.

Because of the dynamical nature of this switching mechanism, we have used a time-delay MLP (a kind of simple feedback neural net) for the implementation of SSC. Together with PSC and CSC, the SSC is shown in Figure 4 as part of the whole neural controller. In this realization the parameters have been chosen as follows: $\mu = 0.004$, $\nu = 0.2$ and $\gamma = 0.03$.

IV. SIMULATION

The simulation of the neural controller for the inverted pendulum system is done on a DECstation using the software package Matlab. The sampling period is 0.01 s and a typical simulation lasts for more than 24000 samples, which is equivalent to 4 minutes in real time. One of the simulation results is shown in Figure 5.

Figure 5: The result of simulation, panels (a) to (d), with horizontal axes in samples ×10^4. Refer to the text for a complete explanation.
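The rules of Section III and the simulation setup of Section IV can be combined into a short closed-loop sketch. It is written in Python rather than Matlab and rests on several assumptions that are not confirmed by the text. Equation (1) is not legible in this copy, so the angular acceleration below uses the frictionless cart-pole form that is common in the literature (for example in [8]), paired with Equation (2). Forward Euler integration at the 0.01 s sampling period is assumed. The printed rules read $\kappa\theta + \dot{\theta} > 0$ and $\rho x + \dot{x} < 0$; the sketch instead uses the switching lines $\dot{\theta} = \kappa\theta$ and $\dot{x} = \rho x$, which is one reading that agrees with the quadrant descriptions in the prose. The track limits are omitted because no value is given, the rules are coded as plain conditionals rather than neural nets, and all function names are hypothetical. No claim is made that this reproduces Figure 5.

```python
import math

# Plant parameters quoted in Section II.
G, F, M_CART, M_POLE, L_HALF = 9.8, 10.0, 1.0, 0.1, 0.5
DT = 0.01  # sampling period quoted in Section IV

# Controller parameters quoted in Section III.
ALPHA, BETA, KAPPA = 0.2, -1.0, -15.0
RHO = -0.5
MU, NU, GAMMA = 0.004, 0.2, 0.03


def wrap(theta):
    """Map an angle to (-pi, pi]: 0 is upright, pi is hanging down."""
    return math.atan2(math.sin(theta), math.cos(theta))


def step(state, force):
    """One forward-Euler step of the ASSUMED frictionless cart-pole model."""
    theta, omega, x, v = state
    total = M_CART + M_POLE
    s, c = math.sin(theta), math.cos(theta)
    tmp = (-force - M_POLE * L_HALF * omega ** 2 * s) / total
    # Assumed stand-in for the missing Equation (1).
    theta_dd = (G * s + c * tmp) / (L_HALF * (4.0 / 3.0 - M_POLE * c * c / total))
    # Equation (2).
    x_dd = (force + M_POLE * L_HALF * (omega ** 2 * s - theta_dd * c)) / total
    return (theta + DT * omega, omega + DT * theta_dd, x + DT * v, v + DT * x_dd)


def psc(theta, omega):
    """Pendulum sub-controller: stabilize inside |theta| < ALPHA, else swing up."""
    th = wrap(theta)
    if -ALPHA < th < ALPHA:
        # Assumed switching line omega = KAPPA * theta.
        return F if omega - KAPPA * th > 0 else -F
    return F if omega > BETA else -F


def csc(x, v):
    """Cart sub-controller with the assumed switching line v = RHO * x."""
    return F if v - RHO * x < 0 else -F


def ssc(theta, omega, use_csc):
    """Switching sub-controller; keeps its previous decision between thresholds."""
    th = wrap(theta)
    if abs(th) < MU and abs(omega) < NU:
        return True          # hand control to the CSC
    if abs(th) > GAMMA:
        return False         # hand control back to the PSC
    return use_csc           # otherwise remember the last choice


def simulate(n_steps, state=(math.pi, 0.0, 0.0, 0.0)):
    """Run the closed loop from the hanging position and record each step."""
    use_csc, history = False, []
    for _ in range(n_steps):
        theta, omega, x, v = state
        use_csc = ssc(theta, omega, use_csc)
        force = csc(x, v) if use_csc else psc(theta, omega)
        state = step(state, force)
        history.append((state, force, use_csc))
    return history
```

Calling `simulate(24000)` corresponds to the 4 minutes of simulated time mentioned above. The `use_csc` flag carried from one sample to the next plays the role of the feedback (time-delay) element of the SSC: between the thresholds $\mu$ and $\gamma$ the decision depends on its own previous value, which a purely static mapping could not express.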
In Figure 5(a), the angular displacement of the pendulum is shown over the whole simulation process, while 5(b) shows a zoomed-in picture for the samples from 15000 to 16000. Figure 5(c) shows the whole evolution process of the cart position, while 5(d) gives a detailed description of the process for samples from 15000 to 16000. It is easy to see that the stabilization of the cart is achieved after the stabilization of the pendulum, and that the cart keeps moving within a small range of the track origin after a short period of time.

V. DISCUSSIONS AND CONCLUSIONS

We have presented a heuristic neural control approach for solving nonlinear control problems. This approach is decision based, and has task decomposition, rule extraction and neural net rule implementation as its essential elements. Via the specific control problem of the inverted pendulum, we have demonstrated how each of the three steps is actually considered and realized.

It may be of interest to compare this approach to other rule-based control approaches such as fuzzy control and expert control. It turns out that rule-based neural control has the following important advantages. First, it is able to implement dynamical decisions and rules rather than just static mapping actions. More precisely, in our feedback neural controller, the outcome of a sub-decision process or rule can depend on its own previous outcome. This dynamical decision process, lacking in both fuzzy and expert control approaches, considerably strengthens the decision ability, and sometimes proves to be indispensable for decision control problems. In fact, it is the problem decomposition that necessitates the dynamical decision process, and decomposition is often considered to be a key to complex control problems.

As a fringe benefit, a dynamic decision requires fewer rules than a purely static decision. This is easily understood, since in the static case time is completely unfolded, resulting in the need to cover all the time aspects of the decision.

Secondly, once constructed, the rule-based neural controller is a deterministic nonlinear dynamical system (with local feedback only), thus allowing analysis to be carried out for performance evaluation. This is in contrast to both fuzzy and expert control, whose final linguistic constructs prevent an analytic evaluation.

Finally, because of its parallel nature, the rule-based neural controller is computationally advantageous over fuzzy and expert control. The computational advantage also follows from the reduced number of rules as a result of the dynamical decision process.

Like all other rule-based approaches, the rule-based neural controller is inherently non-analytical and imprecise. The ambiguity follows from the heuristic and subjective nature of the problem decomposition and rule extraction. Moreover, the approach does not intend to solve all complex nonlinear control problems. What it really attempts to solve are the nonlinear control problems that are less well-defined, but nevertheless seem intuitively solvable.

REFERENCES

[1] B. Widrow (1987), The original adaptive neural net broom-balancer, Int. Symp. Circuits and Syst., pp.351-357.

[2] B. Widrow (1962), Reliable, trainable networks for computing and control, Aerospace Engineering, September, pp.78-123.

[3] K. Fu (1970), Learning control systems - Review and outlook, IEEE Trans. Autom. Control, Vol.AC-15, pp.210-221.

[4] G. N. Saridis (1981), Application of pattern recognition methods to control systems, IEEE Trans. Autom. Control, Vol.AC-26, pp.638-645.

[5] D. W. Clarke, C. Mohtadi and P. S. Tuffs (1987), Generalized predictive control - Part I, The basic algorithm, Automatica, Vol.23, pp.137-148.

[6] D. E. Rumelhart, G. E. Hinton, and R. J. Williams (1986), Learning internal representation by error propagation, In Parallel Distributed Processing, Vol.1, Ch. 8, by D. E. Rumelhart and J. McClelland, Cambridge MA: MIT Press.

[7] M. Sugeno (1985), An introductory survey of fuzzy control, Inform. Sci., Vol.36, pp.59-83.

[8] A. G. Barto, R. S. Sutton and C. W. Anderson (1983), Neuronlike adaptive elements that can solve difficult learning control problems, IEEE Trans. System, Man and Cybernetics, Vol.SMC-13, pp.834-846.

[9] V. V. Tolat and B. Widrow (1988), An adaptive broom balancer with visual inputs, Proc. IEEE Intern. Conf. on Neural Net., San Diego, Vol.2, pp.641-647.

[10] C. W. Anderson (1989), Learning to control an inverted pendulum using neural networks, IEEE Control Systems Magazine, Vol.9, pp.31-37.

[11] J. Hao, S. Tan and J. Vandewalle (1990), A geometric approach to the structural synthesis of multilayer perceptron neural networks, Proc. INNC-90 Paris, Vol.2, pp.881-885.