
Dynamic Inversion of Nonlinear Maps with Applications to Nonlinear Control and Robotics

by Neil Holden Getz

B.S. (Columbia University) 1987
B.F.A. (California College of Arts and Crafts) 1975

A dissertation submitted in partial satisfaction of the requirements for the degree of Doctor of Philosophy in Electrical Engineering and Computer Sciences in the GRADUATE DIVISION of the UNIVERSITY of CALIFORNIA at BERKELEY

Committee in charge:
Professor Jerrold E. Marsden, Chair
Professor Charles A. Desoer
Professor Andrew K. Packard

1995

The dissertation of Neil Holden Getz is approved:


University of California at Berkeley

1995

Dynamic Inversion of Nonlinear Maps with Applications to Nonlinear Control and Robotics

Copyright 1995 by Neil Holden Getz

Abstract

Dynamic Inversion of Nonlinear Maps with Applications to Nonlinear Control and Robotics

by Neil Holden Getz

Doctor of Philosophy in Electrical Engineering and Computer Sciences

University of California at Berkeley

Professor Jerrold E. Marsden, Chair
This dissertation introduces the notion of a dynamic inverse of a nonlinear map. The dynamic inverse is used in the construction of a nonlinear dynamical system, called a dynamic inverter, that asymptotically solves inverse problems with time-varying vector-valued solutions. Dynamic inversion generalizes and extends many previous results on the inversion of maps using continuous-time dynamic systems. By posing the dynamic inverse itself as the solution to an inverse problem, we show how one may solve for a dynamic inverse dynamically while simultaneously using it to solve for the time-varying root of interest. Dynamic inversion is a continuous-time dynamic computational paradigm that may be incorporated into controllers in order to continuously provide estimates of time-varying parameters necessary for control. This allows nonlinear control systems to be posed entirely in continuous time, replacing discrete root-finding algorithms, as well as discrete algorithms for matrix inversion, with integration. Example applications include solving for the intersection of time-varying polynomials, inversion of nonlinear control systems, regular and generalized inversion of fixed and time-varying matrices, polar decomposition of fixed and time-varying matrices, output tracking of implicitly defined reference trajectories, end-effector tracking control for robotic manipulators, and causal approximate output tracking for nonlinear nonminimum-phase systems.

For the problem of output tracking for nonminimum-phase systems, an internal equilibrium manifold is introduced. This manifold is intrinsic to the class of nonlinear nonminimum-phase systems studied. Approximate output tracking is achieved by constructing a controller that makes a neighborhood of the internal equilibrium manifold attractive and invariant. Dynamic inversion is incorporated into the controller to provide a continuous estimate of the manifold location, and this estimate enters the tracking control law. We demonstrate, by application to the tracking problem for the inverted pendulum on a cart, that the resulting internal equilibrium controller significantly outperforms a linear quadratic regulator whose linearization is identical to that of the internal equilibrium controller. We also apply internal equilibrium control to the problem of causing a nonlinear, nonholonomic model of a bicycle to track a time-parameterized trajectory in the ground plane while retaining balance.
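The root-tracking paradigm summarized above can be sketched in a few lines. The Python fragment below is an illustrative reconstruction, not code or notation from the dissertation: for a map F(theta, t) that is strictly increasing in theta, the identity map can serve as the dynamic inverse, and integrating the flow d(theta)/dt = -mu * F(theta, t) drives the state into a neighborhood of the time-varying root, with error shrinking as the gain mu grows. The specific map, gain, and step size here are invented for the example.

```python
import numpy as np

# Illustrative dynamic inverter (a sketch, not the dissertation's code):
# track the time-varying root theta*(t) of
#     F(theta, t) = theta**3 + theta - sin(t) = 0,
# which is strictly increasing in theta, so no nontrivial
# dynamic inverse G is needed.

def F(theta, t):
    return theta**3 + theta - np.sin(t)

def true_root(t):
    # The cubic theta^3 + theta - sin(t) has exactly one real root
    # (its derivative 3*theta^2 + 1 is always positive).
    r = np.roots([1.0, 0.0, 1.0, -np.sin(t)])
    return float(r[np.argmin(np.abs(r.imag))].real)

mu = 100.0      # inverter gain: larger mu -> smaller tracking error
dt = 1e-3       # Euler step used to integrate the continuous-time flow
theta = 3.0     # deliberately poor initial guess
t = 0.0
for _ in range(int(10.0 / dt)):      # integrate d(theta)/dt = -mu * F(theta, t)
    theta -= dt * mu * F(theta, t)
    t += dt

print(f"theta = {theta:.4f}, theta* = {true_root(t):.4f}")
```

Raising mu shrinks the steady tracking error roughly in proportion, which is the bounded-error behavior the abstract describes: the estimate follows the moving root with a lag of order 1/mu, and no discrete root-finding step is ever taken.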

Professor Jerrold E. Marsden Dissertation Committee Chair

For Elise


Contents
List of Figures
List of Tables

1 Introduction
  1.1 Motivation
  1.2 Dynamic Inversion
  1.3 Contributions of this Dissertation
  1.4 Overview of the Thesis

2 Dynamic Inversion of Nonlinear Maps
  2.1 Introduction
    2.1.1 An Informal Introduction to Dynamic Inversion
    2.1.2 Previous Work
    2.1.3 Main Results
    2.1.4 Chapter Overview
  2.2 A Dynamic Inverse
  2.3 Dynamic Inversion
    2.3.1 Dynamic Inversion with Bounded Error
    2.3.2 Dynamic Inversion with Vanishing Error
  2.4 Dynamic Estimation of a Dynamic Inverse
  2.5 Generalizations of Dynamic Inversion
  2.6 Chapter Summary

3 Dynamic Methods for Polar Decomposition and Inversion of Matrices
  3.1 Introduction
    3.1.1 Previous Work
    3.1.2 Main Results
    3.1.3 Chapter Overview
  3.2 Inverting Time-Varying Matrices
    3.2.1 Left and Right Inversion of Time-Varying Matrices
  3.3 Inversion of Constant Matrices
    3.3.1 A Comment on Gradient Methods
    3.3.2 Dynamic Inversion of Constant Matrices by a Prescribed Time
  3.4 Polar Decomposition for Time-Varying Matrices
    3.4.1 The Lyapunov Map
    3.4.2 Dynamic Polar Decomposition
  3.5 Polar Decomposition and Inversion of Constant Matrices
  3.6 Chapter Summary

4 Tracking Implicit Trajectories
  4.1 Introduction
    4.1.1 Motivation
    4.1.2 Previous Work
    4.1.3 Main Results
    4.1.4 Chapter Overview
  4.2 Problem Definition
    4.2.1 System Structure
    4.2.2 Internal Dynamics
    4.2.3 The Output Space
    4.2.4 Output-Bounded Internal Dynamics
    4.2.5 The Problem
  4.3 Tracking Control
    4.3.1 Tracking Explicit Trajectories
    4.3.2 Estimating the Implicit Reference Trajectory
    4.3.3 Estimating Derivatives of Implicit Trajectories
    4.3.4 Combined Dynamic Inverter and Plant
    4.3.5 An Implicit Tracking Theorem
  4.4 An Example of Implicit Tracking
    4.4.1 Simulations
  4.5 Chapter Summary

5 Joint-Space Tracking of Workspace Trajectories in Continuous Time
  5.1 Introduction
    5.1.1 Previous Work
    5.1.2 Main Results
    5.1.3 Chapter Overview
  5.2 Problem Definition
  5.3 Manipulator Tracking Control Methodologies
    5.3.1 Workspace Control of Joint-space Trajectories
  5.4 Joint-Space Control of Workspace Trajectories
  5.5 A Two-Link Example
    5.5.1 Tracking the Other Solution
  5.6 Chapter Summary

6 Approximate Output Tracking for a Class of Nonminimum-Phase Systems
  6.1 Introduction
    6.1.1 Limitations on Tracking Performance
    6.1.2 The Inversion Problem for Nonlinear Systems
    6.1.3 How Dynamic Inversion Will Be Used
    6.1.4 Previous Work
    6.1.5 Differences in Our Approach
    6.1.6 Main Results
    6.1.7 Chapter Preview
  6.2 Jacobian Linearization and Regions of Attraction
    6.2.1 Motivation
    6.2.2 The Role of Jacobian Linearization in Nonlinear Control
    6.2.3 Different Controllers, Same Linearization
    6.2.4 Regions of Attraction
  6.3 Problem Description
    6.3.1 External/Internal Convertible Form
    6.3.2 Properties of E/I Convertible Systems
    6.3.3 The Linearization at the Origin
    6.3.4 The Zero Dynamics
    6.3.5 Conversion of Control Systems to External/Internal Convertible Form
    6.3.6 Balance Systems
    6.3.7 The Regulation and Tracking Problems
    6.3.8 A Comment on "Normal Form"
  6.4 Controlling the External Subsystem
    6.4.1 The External Tracking Dynamics
  6.5 Controlling the Internal Subsystem
    6.5.1 The Internal Tracking Dynamics
  6.6 The Internal Equilibrium Manifold
    6.6.1 Derivatives Along the Internal Equilibrium Manifold
  6.7 Approximate Tracking
    6.7.1 Error Coordinates
    6.7.2 Analysis of the Internal Equilibrium Controller
  6.8 Estimation of the Internal Equilibrium Angle
  6.9 Tracking for the Inverted Pendulum on a Cart
    6.9.1 An Intuitive Description of the Internal Equilibrium Controller
  6.10 Simulations
    6.10.1 Regulation Results
    6.10.2 Tracking Results
  6.11 Discussion
  6.12 Chapter Summary

7 Automatic Control of a Bicycle
  7.1 Introduction
    7.1.1 Chapter Overview
  7.2 The Model
    7.2.1 Assumptions on the Model
    7.2.2 Reference Frames and Generalized Coordinates
    7.2.3 Inputs and Generalized Forces
    7.2.4 Constraints
  7.3 Equations of Motion
    7.3.1 Practical Simplifications
    7.3.2 Conversion to External/Internal Convertible Form
    7.3.3 Internal Dynamics of the Bicycle
  7.4 External Tracking Controller
  7.5 Internal Tracking Controller
  7.6 Internal Equilibrium Angle
    7.6.1 A Dynamic Inverter for the Internal Equilibrium Angle
  7.7 Path Tracking with Balance
  7.8 Simulations
    7.8.1 Straight Path at Constant Speed
    7.8.2 Sinusoidal Path
    7.8.3 Circle at Constant Velocity
    7.8.4 Figure-Eight Trajectory
  7.9 Chapter Summary

8 Conclusions
  8.1 Review
  8.2 Observations
    8.2.1 Dynamic Time vs. Computational Time
    8.2.2 Realization of Dynamic Inverters
  8.3 Future Work
    8.3.1 Methods for Producing Dynamic Inverses
    8.3.2 Differential-Algebraic Systems
    8.3.3 Inverse Kinematics with Singularities
    8.3.4 Tracking Multiple Solutions
    8.3.5 Tracking Optimal Solutions
    8.3.6 Control System Design

Bibliography

A Notation and Terminology

B Some Useful Theorems
  B.1 A Comparison Theorem
  B.2 Taylor's Theorem
  B.3 Singularly Perturbed Systems
  B.4 Tracking Convergence for Integrator Chains
  B.5 A Converse Theorem
  B.6 Uniform Ultimate Boundedness

C Partial Feedback Linearization of Nonlinear Control Systems

List of Figures
2.1  The map F(θ), where θ* is the unique solution to F(θ) = 0. The shaded region represents possible values of the function F(θ).
2.2  There exists a line passing through (θ*, 0) of slope κ > 0 such that F(θ) (shown gray) is above the line to the right of θ* and below the line to the left of θ*.
2.3  Any function F(θ), Lipschitz in θ, that is transverse to the θ-axis at θ* and whose values lie in the shaded regions of either of these graphs may be inverted with the dynamic system (2.13).
2.4  The function (θ − 1)³ with root θ* = 1. No line of slope κ may be drawn through θ* as in Figure 2.2.
2.5  The function G[w] := sign(w)|w|^(1/4), a dynamic inverse of F(θ) = (θ − 1)³.
2.6  The composition G[F(θ)] = sign((θ − 1)³)|(θ − 1)³|^(1/4). Now we can draw a line of slope κ = 1/2 (dashed) through (θ*, 0) = (1, 0), like the line in Figure 2.2. The dotted curve is F(θ) = (θ − 1)³.
2.7  For any y in B_r(t1), the constant matrix D1F(y, t1)^(-1) provides a dynamic inverse for F(θ, t) over a sufficiently small interval (t0, t2) containing t1. See Theorem 2.2.13.
2.8  The function F(θ, t) (2.65) for t = 0 (solid), t = 1/8 (dotted), and t = 3/8 (dashed).
2.9  The upper graph shows the solutions of the dynamic inverter (2.66) for μ = 10 (dashed) and μ = 100 (solid). The initial condition was θ(0) = 3. The lower graph shows the estimation error for the dynamic inverter (2.66) using μ = 10 (dashed) and μ = 100 (solid).
2.10 Nonlinear circuit element of Example 2.3.4.
2.11 The characteristic Va − Vb versus f(Va − Vb) is strictly monotonic, continuous, and lies in the shaded region. A typical curve is shown. See Example 2.3.4.
2.12 Circuit realization of a dynamic inverter. See Example 2.3.4.
2.13 Effective characteristic (solid) of the dynamic inverter circuit of Figure 2.12. The nonlinear element's characteristic is indicated in gray. See Example 2.3.4.
2.14 The top graph shows solutions of the dynamic inverter (2.89), with dynamic inverse estimate E(θ, t), for μ = 10 (dashed) and μ = 100 (solid), with the actual solution θ*(t) (dotted). The initial condition was θ(0) = 3. The bottom graph shows the corresponding estimation error.
2.15 The top graph shows the state trajectory θ(t) (solid) of the dynamic inverter (2.92), along with the solution θ*(t) (dotted). The bottom graph shows the error norm |θ(t) − θ*(t)|.
2.16 The solution of interest in Example 2.4.7, θ*(t) = (x*(t), y*(t)), is the intersection (to the right of (0, 0)) of the two cubic curves shown in each of the graphs. This figure shows the pair of cubic curves (2.115) for t in {0, 1, ..., 5}.
2.17 The solution of the dynamic inverter of Example 2.4.7 for F(θ, t) = 0, where θ = (x, y). The upper graph shows x(t) versus t (solid) and y(t) versus t (dashed). The lower graph shows x(t) versus y(t), with the initial condition (x(0), y(0)) = (1, 0) marked by the small circle.
2.18 The estimation error for the dynamic inverter of Example 2.4.7 as seen through F (2.116): log10 ||F(θ(t), t)|| versus t in seconds. See Example 2.4.7.
2.19 The closed-loop system with dynamic inversion compensator (2.131), with state (θ, u), and the nonlinear plant (2.122), with state x.
3.1  The matrix homotopy H(t).
3.2  The matrix homotopy H(t) from I to M, with the corresponding solution θ*(t), the inverse of H(t).
3.3  The homotopy from I to M must remain in GL+(n, R) to be invertible.
3.4  Elements of A(t) (see (3.50)). See Example 3.4.3.
3.5  Elements of x (top) and θ (bottom). See Example 3.4.3.
3.6  The error, on a log10 scale, indicating the extent to which x fails to satisfy x(t)A(t)Aᵀ(t)x(t) − I = 0. The ripple from t ≈ 1.8 to t = 8 is due to numerical noise. See Example 3.4.3.
3.7  θ(t) is positive definite and symmetric for all t in [0, 1].
3.8  Elements of x(t) (top) and θ(t) (bottom), for Example 3.5.3.
3.9  The base-10 log of the error ||x(t)MMᵀx(t) − I||, for Example 3.5.3.
4.1  Schematic of (4.11).
4.2  The cart and ball system.
4.3  Three equilibria of the cart and ball.
4.4  If (y(t), ẏ(t), ÿ(t)) is kept sufficiently small for all t ≥ 0, then the ball remains in the bowl.
4.5  Output-bounded internal dynamics.
4.6  Some cart-ball systems that do not have output-bounded internal dynamics.
4.7  Some cart-ball systems that do have output-bounded internal dynamics.
4.8  The zero dynamics vector field for the zero dynamics (4.27) of Example 4.2.11. The origin of the zero dynamics is unstable, but η(t) is bounded on [0, ∞) when |η(0)| < 3.
4.9  The closed-loop control system [C, P].
4.10 If ||(ξ(0), u(0))|| < δ and ||(Yd(t), yd^(r)(t))|| < ε, with δ and ε sufficiently small, then convergence of (ξ(t), u(t)) to (Yd(t), yd^(r)(t)) preserves the upper bound on the internal state η(t).
4.11 As (ξ(t), u(t)) converges to (Yd(t), yd^(r)(t)), it must remain in the ball B.
4.12 If δ and ε are not sufficiently small, then (ξ(t), u(t)) may converge exponentially to (Yd(t), yd^(r)(t)) but leave the ball B at some time.
4.13 If (ξ(0), u(0)) is in the ball B, and η(0) is in the ball B_η, then η(t) remains in B_η for all t ≥ 0. B is that ball in which (ξ(t), u(t)) must remain in order that η(t) remain in B_η. Compare to Figure 4.5.
4.14 Top: the output y(t) (solid), the implicit reference trajectory (dotted), and its estimate (dashed) for the simulation of Example 4.4.1. Bottom: the output tracking error.
4.15 The internal state η(t) for the simulation of Example 4.4.1.
4.16 The top graph shows the estimation error in the implicit reference trajectory for Example 4.4.1. The bottom graph shows the estimation error in its derivative.
4.17 The top left graph shows the phase plot of η1 versus η2. The top right graph shows the estimate θ versus its derivative. The lower left graph shows θ versus E^(-1)(θ, η, t). The lower right graph shows the tracking error phase. The symbol "o" marks the initial conditions for each plot.
5.1  A sequence of poses {xd(tk)} along the workspace trajectory are inverted via an inverse-kinematics algorithm. The resulting sequence of joint-space points {θd(tk)} is then splined to form a joint-space reference trajectory.
5.2  The black curve on the left corresponds to the desired end-effector trajectory xd(t). The black dots on the left correspond to points of xd(t) at a discrete sequence of times t1 < t2 < t3 < t4. The black curve on the right corresponds to the inverse kinematic solution θd(t) satisfying F(θd(t)) = xd(t). The black dots on the right correspond to the inverse kinematic solutions θd(tk) satisfying F(θd(tk)) = xd(tk). The white curve on the right corresponds to a time-parameterized spline through the sequence {θd(tk)}; the white curve on the left is its image under F. Note that the error between xd(t) and the image of the spline is non-uniform, going to zero at the sample points and diverging from xd(t) away from the sample points.
5.3  The four robot control strategies are represented each by one of the four arrows. This chapter presents a JCWT strategy, indicated by the black arrow.
5.4  A two-link robot arm with joint angles θ = (θ1, θ2), joint torques τ = (τ1, τ2), end-effector position x, desired end-effector position xd, link lengths l1 and l2, and link masses m1 and m2, assumed to be point masses.
5.5  Two configurations corresponding to the same end-effector position.
5.6  The top left graph shows convergence of the workspace paths: F(θ) (solid), F(θ̂) (dashed), and F(θd) (dotted), corresponding to the initial conditions of Table 5.1. The top right graph shows convergence of the joint-space paths: θ (solid), θ̂ (dashed), and θd (dotted). For the top graphs the symbol "o" marks the initial condition for each trajectory. For the two bottom graphs, the upper one shows the l2-norm of the estimation error and the lower one shows the l2-norm of the tracking error.
5.7  The top left graph shows convergence of the workspace paths: F(θ) (solid), F(θ̂) (dashed), and F(θd) (dotted), for the other inverse kinematic solution corresponding to the initial conditions of Table 5.2. Note that the path F(θd) is a periodic curve of period 2π. The top right graph shows convergence of the joint-space paths: θ (solid), θ̂ (dashed), and θd (dotted), for the other inverse kinematic solution. For the top graphs the symbol "o" marks the initial condition for each trajectory. For the bottom two graphs, the upper one shows the l2-norm of the estimation error and the lower one shows the l2-norm of the tracking error.
5.8  An irregular joint geometry.
6.1  A balancing cart-ball system.
6.2  A reference trajectory yd(t), bounded together with its first two derivatives: sup over t ≥ 0 of ||[yd(t), ẏd(t), ÿd(t)]ᵀ|| < ∞. For this graph t1 = 1, k = 1/(2π). In general we assume t1 > 0 is unknown.
6.3  An external/internal convertible system.
6.4  The external subsystem Σ_ext(u) of Σ(u) (see also Figure 6.3).
6.5  The internal subsystem Σ_int(x, u) of Σ(u) (see also Figure 6.3).
6.6  The plant Σ(u) reconstructed from the internal and external subsystems.
6.7  The zero dynamics of Σ(u).
6.8  The interconnection of plant Σ(u) and compensator C(v).
6.9  The internal tracking controller.
6.10 When f(x, θ) and g(x, θ) are independent of x, then θ_e may be regarded as a time-varying function of v_ext.
6.11 The internal equilibrium controller causes the error [e_xᵀ, e_θᵀ]ᵀ to converge toward 0 exponentially until it reaches the ball B_b in Rⁿ. See Proposition 6.7.4.
6.12 Inverted pendulum on a cart.
6.13 Regulation of the inverted pendulum. The internal equilibrium manifold E(t) is outlined in bold gray in the lower graph. The actual (x1, x2, θ1) trajectory of the pendulum is indicated in black, and its projection (x1, x2, θ_e) onto E(t) is shown in gray.
6.14 Regulation trial for initial conditions x1(0) = 0, x2(0) = 0, θ1(0) = 10°, θ2(0) = 0. Since yd ≡ 0, (e_x1, e_x2) = (x1, x2).
6.15 Regulation trial for initial conditions x1(0) = 0, x2(0) = 0, θ1(0) = 20°, θ2(0) = 0.
6.16 Regulation trial for initial conditions x1(0) = 0, x2(0) = 0, θ1(0) = 50°, θ2(0) = 0.
6.17 Regulation trial for initial conditions x1(0) = 0, x2(0) = 0, θ1(0) = 60°, θ2(0) = 0.
6.18 Regulation trial for initial conditions x1(0) = 0, x2(0) = 0, θ1(0) = 85°, θ2(0) = 0.
6.19 Regulation trial for initial conditions x1(0) = 1, x2(0) = 0, θ1(0) = 0, θ2(0) = 0.
6.20 Regulation trial for initial conditions x1(0) = 8, x2(0) = 0, θ1(0) = 0, θ2(0) = 0.
6.21 Regulation trial for initial conditions x1(0) = 16, x2(0) = 0, θ1(0) = 0, θ2(0) = 0.
6.22 Regulation trial for initial conditions x1(0) = 64, x2(0) = 0, θ1(0) = 0, θ2(0) = 0.
6.23 Tracking trial for initial conditions x(0) = 0, θ(0) = 0, with yd(t) = sin(0.2πt), a 0.1 Hz sinusoid.
6.24 Tracking trial for initial conditions x(0) = 0, θ(0) = 0, with yd(t) = sin(0.4πt), a 0.2 Hz sinusoid.
6.25 Tracking trial for initial conditions x(0) = 0, θ(0) = 0, with yd(t) = sin(πt), a 0.5 Hz sinusoid.
7.1  Side view of the bicycle model with zero roll angle.
7.2  Bicycle model rolled away from upright by the roll angle, which is negative in this figure.
7.3  Leaning bicycle showing the relationship between the steering angle and the steering shaft angle. Note that in the figure the roll angle is negative.
7.4  The bicycle model showing body velocities v_r and v_f. Note that the roll angle in the figure is negative.
7.5  Top view of the rear wheel showing the relationships among v_r, v_f, ẋ, and ẏ.
7.6  Velocity geometry for constraints.
7.7  Target path (xd, yd) = (5t, 0) [m]. The x and y scales are in meters. The bicycle's path in the plane (solid) with the desired straight path (dotted).
7.8  Target path (xd, yd) = (5t, 0) meters. The top graph shows the tracking error ||(x, y) − (xd, yd)||₂ versus t. The second graph shows the steering angle. The third graph shows the rear wheel velocity v_r (solid) with desired rear-wheel velocity (dotted). The fourth graph shows the roll angle (solid) with internal equilibrium roll angle (dotted).
7.9  Internal equilibrium control causes the bicycle to steer itself so that its roll angle converges to a neighborhood of the equilibrium roll angle, shown as a dashed line.
7.10 Sinusoidal target path (xd(t), yd(t)) = (5t, 2 sin(0.2t)) [m]. The bicycle's path in the plane (solid) with the desired path (dotted).
7.11 Sinusoidal target path (xd, yd) = (5t, 2 sin(0.2t)). The top graph shows the tracking error ||(x, y) − (xd, yd)||₂. The bottom three graphs show the steering angle, the rear wheel velocity v_r (solid) with desired rear-wheel velocity (dotted), and the roll angle (solid) with internal equilibrium roll angle (dotted).
7.12 Circular target trajectory with radius 8 meters and tangential velocity of 5 meters per second. The first 10 seconds of the bicycle's path in the plane (solid) with the desired circular path (dotted).
7.13 Circular target trajectory with radius 8 meters and tangential velocity of 5 meters per second. The top graph shows the tracking error ||(x, y) − (xd, yd)||₂. The bottom three graphs show the steering angle, the rear wheel velocity v_r (solid) with desired rear-wheel velocity (dotted), and the roll angle (solid) with internal equilibrium roll angle (dotted).
7.14 The bicycle's path in the plane (solid) with the desired figure-eight path (dotted).
7.15 Figure-eight target trajectory. The top graph shows the tracking error ||(x, y) − (xd, yd)||₂. The bottom three graphs show the steering angle, the rear wheel velocity v_r (solid) with desired rear-wheel velocity (dotted), and the roll angle (solid) with internal equilibrium roll angle (dotted).


List of Tables
4.1 Initial conditions for the implicit tracking controller simulation. 106
4.2 Parameters for the implicit tracking controller simulation. 107
5.1 The table on the left shows parameters for the simulation of implicit tracking control of a two-link robot arm. The table on the right shows initial conditions. All angles are in radians. 128
5.2 Initial conditions for the simulation of implicit tracking control of the other solution for a two-link robot arm. All angles are in radians. 130
6.1 Initial conditions for regulation simulations. An asterisk * indicates that the corresponding initial conditions are in the region of attraction of the origin for the particular controller. 182
7.1 Physical and gain parameters for the simulations. 219
7.2 Initial conditions for a straight trajectory at constant speed. 219
7.3 Initial conditions for the sinusoidal trajectory at constant speed. 223
7.4 Initial conditions for following a circular trajectory. 225
7.5 Initial conditions for following the figure-eight trajectory. 227

Acknowledgements
Many people have given me help, guidance, freedom, access, and inspiration in my years in graduate school. To all of them I am very grateful. I wish to thank Geneviève Thiébaut, Heather Levien, Mary Byrnes, Carol Block, Chris Colbert, Susan DeVries, Tito Gatchalian, Diane Hsuing, and Jeff Wilkinson for their advice and good-natured support in guiding me through various bureaucratic hurdles of graduate school. I am grateful to my friends Adam Schwartz and Shahram Shahruz for their encouragement and suggestions on my work, and to Ed Nicolson and Steve Burgett for making themselves available for my many questions about LaTeX. Thanks to Max Holm who, through his dedication and perseverance, has kept the robotics lab computer network running. Thanks too to other friends and office-mates, past and present, who have given me the pleasure of their company and companionship. I thank the members of my qualifying exam committee, Professors Seth Sanders, Hami Kazerooni, and Ron Fearing, who, through their questions, suggestions, and interest, gave me that sense of validation that a graduate student needs to cross the threshold from the classical into the unknown. I thank Professors John Wawrzynek, Leon Chua, Shankar Sastry, Alan Lichtenberg, Max Mendel, Karl Hedrick, and Philip Stark, professors who gave me the freedom to chase after my ideas in various independent-study courses over the years. Thanks too to Professor John Canny for allowing me access to the robotics lab computers. I thank NASA and NSF for funding during Fall 1992 and Summer 1993. I also wish to thank NSF for travel awards which allowed me to travel to conferences in Nagoya and New Orleans. To my parents I am grateful for many things. They have been encouraging and patient throughout my re-education from artist to engineer, tolerating my neglect with an understanding, from their own experience, of the level of work necessary to achieve most things of value. I have seen little of them since I have been in graduate school.
I hope to see much more of them in the future. The most fortunate incident in my graduate school years has been my acquaintance with my fellow engineer Dr. Elise Burmeister. For her companionship, understanding, tolerance, patience, patience, and patience I cannot express my gratitude. For many reasons, I do not believe that I would have survived graduate school to graduation without her tenderness, comfort, and good humor. To her I dedicate this dissertation.

To my readers, I am especially grateful. Professor Andrew Packard has supplied me with encouragement and good will at a time when I needed both. The depth of his understanding of control theory is an inspiration that I will carry with me into my professional years. I look forward to his friendship in the years to come. My research advisor, Professor Jerrold Marsden, has given me encouragement, guidance, and most importantly his interest in my ideas. There are a few electrifying moments in one's research career where ideas snap together, fusing in a flash of revelation, understanding walking calmly out of the light. These are the moments for which researchers struggle and live. With Jerry as my advisor, I have had more than my share of such moments. In an environment where familiarity is too often passed off as understanding, I have known of no one with a deeper sense of what it means to understand something than he. Where others are ready to move on, thinking that all has been seen and assimilated, Jerry says "Wait! There's more." There always is more, much more, and without Jerry we might have missed it. I look forward to many more years of collaboration with him as friend, colleague, and mentor. Finally, I wish to express my profound gratitude to Professor Charles Desoer for countless hours of careful reading, thoughtful suggestions, encouragement, and relentless, though always constructive, criticism. Since I first began to study the problem of controlling a bicycle back in the late spring of 1992, he has been my constant guide. There is a fearless spirit in great scientists, an uncompromising determination to stand their ground in the face of ferocious Complexity. I have had the privilege of becoming familiar with that spirit in hours of consultation with Professor Desoer, and the thrill of seeing Complexity in retreat.
When I asked Professor Desoer to be on my dissertation committee, it was because I knew of no better motivation for careful thought and exposition than knowing that he would be reading and criticizing the product of my efforts. This was the best decision I have made in my graduate career. By both example and instruction he has been my most influential guiding force, teaching me how to think and live like an engineer.

Chapter 1

Introduction
1.1 Motivation
Nonlinear equations arise frequently in nonlinear control as well as many other areas of applied mathematics¹. They may appear as constraints on a dynamic system, or as equations whose time-varying roots may be important reference signals upon which a control system relies. Numerical estimates of both fixed and time-varying roots can, however, be problematic. Even for R¹ → R¹ maps, Newton's method² is known to fail, for instance, near local minima and maxima. Newton's method is not even applicable for nondifferentiable functions, and when applied to differentiable functions, the differential DF(θ) must be known; this is not always convenient. Secant and regula falsi methods skirt the need for DF(θ) by estimating it using values of F(θᵢ) for successive iterations θᵢ, but they too fail when local maxima and minima are encountered. The bisection method is a very general and robust method for approximating roots, but it does not generalize to multiple dimensions. In fact, for Rⁿ → Rⁿ maps, virtually the only method available for solving for roots is Newton-Raphson.

¹ It is common practice to include linear objects in the set of nonlinear objects. This convention will be obeyed throughout this dissertation. Thus, a linear map is also a nonlinear map, but a nonlinear map is not, in general, linear.
² See [GMW81] for a review of the various numerical methods mentioned here.

In applying discrete inversion routines in the context of the control of dynamic systems we are faced with another problem. Numerical techniques for the solution of linear and nonlinear equations are usually performed in discrete time, as a sequence of intermediate computations converging toward a solution. This can be a disadvantage in that the employment of such methods, in the context of the control of continuous-time dynamic systems, necessitates a discrete-time approach to the combination of computation and control, often making implementations and proofs tedious and difficult.

In the context of the control of nonlinear systems, the problem of inversion arises when one wishes to control the output of a control system to track a desired trajectory. One must then invert the control system in order to obtain a state trajectory and control which will produce the desired output. For nonminimum-phase systems, i.e. systems with unstable zero dynamics (see Chapter 4, Definition 4.2.5), such inversion presents some difficult and fundamental problems. Methods exist [DPC94, HMS94] for computing exact and approximate inverse solutions. However, in the nonminimum-phase case the resulting inverse trajectory will not, in general, have an initial condition that corresponds to the initial condition of the control system. One must have complete knowledge of the desired output trajectory and have the freedom to preset initial conditions [BL93] of the control system in order to achieve exact output tracking. Such knowledge is often unavailable, and even when available, presetting of initial conditions is usually not practical.

This dissertation has been motivated by the problem of causal inversion of nonlinear nonminimum-phase systems. Indeed, a concrete motivational problem has been the design of a controller for a simple mathematical model of an autonomous bicycle, where we wish to make the bicycle follow a time-parameterized path in the ground plane without falling over. Knowing that we can stabilize any smooth roll-angle trajectory and rear-wheel velocity, how can we choose the roll-angle trajectory and rear-wheel velocity to produce the desired tracking behavior in the plane? As a practical matter we assume that we can only count on knowing a reference trajectory and its derivatives at the present time, not the entire future of the trajectory. This immediately rules out the utility of presetting initial conditions.
For nonminimum-phase systems, it also rules out the possibility of exact tracking of reference trajectories drawn from an open set [GBLL94]. Consequently we will sacrifice some exactness of output tracking in order to construct a controller that provides approximate inversion with bounded internal state.

1.2 Dynamic Inversion

In this dissertation a continuous-time dynamic method for approximating solutions θ*(t) to nonlinear equations of the form F(θ, t) = 0 will be presented. We call this method and its resulting computational paradigms dynamic inversion. We will associate with F(θ, t) a dynamic system θ̇ = Φ(θ, t) with the crucial property that an arbitrarily


small neighborhood of θ*(t) is exponentially attractive. There will be two cases: one where θ*(t) itself is attracting, and one where a region about θ*(t) is attracting. We will rely upon the notion that any dynamic system having a stable equilibrium may be regarded as a representation of an analog computational architecture for the solving of an equilibrium equation. For the special case that F(θ, t) = F(θ) we will see that the method presented in this dissertation is considerably more general than the discrete-time iterative methods mentioned above. Differentiability of F(θ) is not required, though convenient when available. Our method is dissuaded by neither local minima nor local maxima. It generalizes easily to Rⁿ → Rⁿ maps. We will apply dynamic inversion in order to provide solutions for the problems mentioned in Section 1.1 above.

1.3 Contributions of this Dissertation

The main new results of this dissertation are:

- A methodology for the construction of continuous-time dynamic systems that solve inverse problems having finite-dimensional time-varying solutions.
- A geometric approach to approximate output tracking for a class of nonlinear nonminimum-phase systems.
- A theorem on the effect of affine disturbances on exponentially stable dynamic systems.
- A useful nonlinear characterization of internal dynamics for nonlinear systems which extends notions of internal stability beyond the usual characterization of stable or unstable zero dynamics.

Application of these results has resulted in further contributions of this dissertation:

- Nonlinear dynamic systems that solve for inverses and polar decompositions of fixed and time-varying matrices.
- A class of tracking controllers which allow nonlinear control systems to track implicitly defined trajectories.
- A control methodology for robotic manipulators which allows tracking of workspace trajectories using gains and errors posed in joint space.


- A tracking controller for balancing two-wheeled vehicles such as bicycles and motorcycles.

1.4 Overview of the Thesis

This dissertation presents a nonlinear dynamic framework for solving a class of inverse problems and applies this framework to a variety of problems that arise in nonlinear control. The dissertation is organized as follows:

Chapter 2. Dynamic Inversion of Nonlinear Maps. In this chapter the notion of

a dynamic inverse of a map is introduced. Given an inverse problem, where the inverse solution is posed as the root of a time-dependent map, the dynamic inverse of the map is combined with the map to produce a nonlinear dynamic system whose solution asymptotically approximates the root. We present properties of the class of dynamic inverses of a map which allow coupled inverse problems, problems whose solutions depend on each other, to be combined into a single dynamic system that produces all of the coupled solutions. By posing the dynamic inverse itself as the solution to an inverse problem, we show how both the dynamic inverse and the solution to the inverse problem of interest may be solved simultaneously.

Chapter 3. Dynamic Inversion and Polar Decomposition of Matrices. In this chapter we present dynamic methods for the inversion and polar decomposition of fixed and time-varying matrices. Four main results are presented. First we show how a time-varying matrix inverse may be tracked given a good initial guess at the inverse at an initial time. In the second result, we show how a fixed matrix may be dynamically inverted given a good initial guess at its inverse and how, for special classes of matrices, the initial guess need not be close to the solution. We then show how dynamic inversion may be applied in order to produce the polar decomposition, as well as the inverse, of a time-varying matrix. This leads to similar results for fixed matrices, though in the case of fixed matrices we show how inversion and polar decomposition may be achieved in finite time rather than asymptotically.

Chapter 4. Tracking Implicit Trajectories. Output tracking control for systems having relative degree and stable zero dynamics is a well-understood problem when the reference trajectory to be tracked is posed explicitly. When the trajectory is posed


implicitly, however, no such tracking control methodology exists. Here we present such a methodology, combining the dynamic inversion methods of Chapter 2 with more conventional tracking control to produce a dynamic controller for tracking implicit trajectories. We also introduce the concept of output-bounded internal dynamics, which is a nonlinear extension of the more common notion of internal stability usually applied to nonlinear systems and inherited from linear systems. We show that application of the implicit tracking controller preserves output-bounded zero dynamics for tracking of an open set of reference outputs.

Chapter 5. Joint-Space Tracking of Workspace Trajectories in Continuous Time. In this chapter we consider the problem of tracking control of robotic manipulators. Given a desired end-effector reference trajectory, the objective is to apply joint forces and torques so that the path of the end-effector of the manipulator converges to the desired reference trajectory. Two standard approaches are first examined. In one, a discrete inverse kinematics algorithm is applied to points along the reference trajectory to create a sequence of points in joint space. These joint-space points are then splined together, and a standard tracking controller controls the joint torques in such a way that the splined joint-space trajectory is followed. In a second method, through differentiation of the forward kinematics map, the dynamic equations of the robotic manipulator are transformed into workspace coordinates. Tracking control is then posed directly in the workspace. In contrast, using results on implicit tracking from Chapter 4, we present a controller which allows continuous-time inversion of the forward kinematics so that gains and errors for tracking control may be posed in the joint space.

Chapter 6. Approximate Output Tracking for a Class of Nonminimum-Phase Systems.
For a significant class of nonlinear control systems, a control which holds the output identically to zero results in unstable internal dynamics. For such systems, exact tracking of output reference trajectories drawn from an open set is not possible, if one also wishes to maintain bounded state trajectories, without the ability to preset initial conditions. This chapter offers an approach to tracking control which trades off some accuracy of tracking for internal boundedness and stability. The complete history of the output reference trajectory is not assumed known in advance. An internal equilibrium manifold is defined. It is a submanifold of state space with the special property that if the state of the system is near that manifold, then the output


approximately tracks the output reference trajectory. A controller is presented which causes a neighborhood of the manifold to become attractive and invariant. Thus if the manifold is bounded, then the state is bounded and approximate output tracking is achieved. The internal equilibrium controller is applied to the tracking control of the classical problem of the inverted pendulum on a cart. Comparison to a linear quadratic regulator shows a significant increase in performance. Dynamic inversion is incorporated into the controller to provide a signal used to track the location of the internal equilibrium manifold.

Chapter 7. Automatic Control of a Bicycle. Based on the results of Chapter 6, an internal equilibrium controller is constructed for the tracking control of a nonlinear nonholonomic nonminimum-phase model of a bicycle or motorcycle. A simple model of a bicycle is presented. Through nonholonomic reduction and manipulation of the rolling constraints, equations of motion amenable to the techniques of Chapter 6 are obtained. Simulation results verify the theory of Chapter 6 while the bicycle tracks a straight line, a sinusoid, a circle, and a figure-eight.

Chapter 8. Conclusions. The main results of the dissertation are summarized and a number of ideas and problems for future work are presented.

In a number of appendices we include some reference material for the reader: Appendix A contains notation and definitions; Appendix B contains a number of useful theorems drawn from outside sources; and Appendix C contains a review of the subject of feedback linearization of nonlinear control systems. The results contained in Appendices B and C will be pointed to when needed.

Chapter 2

Dynamic Inversion of Nonlinear Maps


2.1 Introduction
In this chapter we describe a continuous-time dynamic methodology for inverting nonlinear maps. We call this methodology dynamic inversion. Given a map¹ F : Rⁿ × R₊ → Rⁿ known, by some means, to have a continuous isolated solution θ*(t) to F(θ, t) = 0, we associate with F(θ, t) another map G[w, θ, t], which we call a dynamic inverse of F(θ, t). The map G[w, θ, t] is characterized by the property that the dynamic system

θ̇ = −G[F(θ, t), θ, t]    (2.1)

has a solution θ(t) which converges asymptotically to the solution θ*(t).

2.1.1 An Informal Introduction to Dynamic Inversion


Dynamic inversion is most easily introduced² by first considering the problem of finding the root of a real-valued function on the real line.

A. Consider the function F : R → R, θ ↦ F(θ), illustrated in Figure 2.1. Assume that we do not know the solution θ* to F(θ) = 0, but that we would like to find it using a representation of F(θ). The representation of F(θ) may be, e.g., in the form of a closed-form expression, a combination of table-lookup and interpolation, a physical (non-dynamic) system with an input θ and an output F(θ), or any combination of the above. Assume that

¹ We will use the terms map, mapping, and function interchangeably.
² The precise hypotheses will be developed starting in Section 2.2 below.



Figure 2.1: The map F(θ), where θ* is the unique solution to F(θ) = 0. The shaded region represents possible values of the function F(θ).

we know that a unique solution θ* exists in the interval [a, b] ⊂ R.

The function F(θ) of Figure 2.1 has a number of features which limit the choices of techniques that may be used to find its root, θ*. It is, in places, not differentiable. It also has minima and maxima at points other than θ*. We may even be uncertain about the value of F(θ) for θ in certain regions of [a, b], as indicated by the shaded region of the graph³. We will assume, however, that F(θ) is Lipschitz continuous on [a, b]. Clearly Newton's method and its variants (e.g. secant method, regula falsi) would fail to find the root of this function if the initial guess at the root is not close to θ*. However, we make the following claim:

Claim 2.1.1 For any initial value θ(0) ∈ [a, b], the solution θ(t) to the dynamic system

θ̇ = −F(θ)    (2.2)

converges to the root θ* as t → ∞.

Informal Proof of Claim 2.1.1: Consider a solution of (2.2). Assume that F(θ) is such that a solution θ(t) of (2.2) exists for any θ(0) ∈ [a, b]. If θ(0) = θ*, then F(θ(0)) = 0,

³ We assume that there exists some k > 0 such that F(θ) ≤ −k < 0 in the region of Figure 2.1 marked by "?".


Figure 2.2: There exists a line passing through (θ*, 0) of slope λ > 0 such that F(θ) (shown gray) is above the line to the right of θ* and below the line to the left of θ*.

so (2.2) works fine for this case. If θ(0) ∈ [a, θ*), then the vector field −F(θ) pushes the state to the right, towards θ*. As long as F(θ(t)) < 0 this will continue to be so. Since F(θ) < 0 for all θ ∈ [a, θ*), the solution θ(t) will flow to θ* as t → ∞. Likewise, if θ(0) ∈ (θ*, b], then −F(θ) pushes the solution θ(t) left to θ* as t → ∞.

The argument above suggests that for maps similar to F(θ) in Figure 2.1, i.e. maps

whose values are strictly above the abscissa to the right of θ* and strictly below the abscissa to the left of θ*, θ(t) → θ* asymptotically as t → ∞. We will make an additional claim, however.

Claim 2.1.2 The convergence θ(t) → θ*, where θ(t) is the solution of (2.2), is in fact exponential; that is, there exist k₁ and k₂ in R, 0 < kᵢ < ∞, i ∈ {1, 2}, such that for all t ≥ 0,

|θ(t) − θ*| ≤ k₁ |θ(0) − θ*| e^(−k₂ t)    (2.3)
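Both claims are easy to check numerically. The sketch below is an illustration, not part of the dissertation: the particular F chosen here is an invented stand-in for the map of Figure 2.1 (Lipschitz, non-monotone on [0, 2], with a kink at θ = 1.3, root θ* = 1), built so that F lies above a line of positive slope through the root on the right of θ* and below it on the left, and (2.2) is integrated by forward Euler:

```python
import math

def F(theta):
    # Invented stand-in for Figure 2.1's map: Lipschitz, non-monotone,
    # not differentiable at theta = 1.3, with root theta* = 1.
    z = theta - 1.0
    return 2.0 * z + 1.2 * math.sin(3.0 * z) + 0.1 * (abs(theta - 1.3) - 0.3)

theta, h = 2.0, 1e-3        # theta(0) = 2, an endpoint of [a, b] = [0, 2]
for _ in range(10_000):     # forward-Euler integration of (2.2) to t = 10
    theta -= h * F(theta)

print(abs(theta - 1.0))     # error after ten time units: very small
```

Despite the slope of F changing sign within [0, 2], the iterate converges rapidly to θ* = 1, consistent with the exponential bound (2.3).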

The important feature of F(θ) in Figure 2.1 which allows us to make Claim 2.1.2 is illustrated in Figure 2.2. Note that to the right of the root θ*, the graph of F(θ) is above a line of slope λ passing through (θ*, 0), and to the left of θ*, the graph of F(θ) is


below the same line. An equivalent expression of this feature is to define z := θ − θ* and say that for all z ∈ [a − θ*, b − θ*],

z F(z + θ*) ≥ λ z²    (2.4)

Informal Proof of Claim 2.1.2: Let V(θ) := ½(θ − θ*)² = ½z². Differentiate V(θ) with respect to t to get

(d/dt) V(θ) = (θ − θ*) θ̇ = −z F(θ)    (2.5)

But from (2.4) we have

z F(θ) = z F(z + θ*) ≥ λ z²    (2.6)

Note that

λ z² = 2λ (½ z²) = 2λ V(θ)    (2.7)

Therefore

V(θ(t)) ≤ V(θ(0)) e^(−2λt)    (2.8)

Insert the definition of V(θ) into (2.8) to get

½ (θ(t) − θ*)² ≤ ½ (θ(0) − θ*)² e^(−2λt)    (2.9)
Multiply (2.9) by 2 and take the positive branch of the square root of both sides of the resulting equation to get

|θ(t) − θ*| ≤ |θ(0) − θ*| e^(−λt),   t ≥ 0    (2.10)

which proves the claimed exponential convergence.

We call the dynamic system (2.2) a dynamic inverter for θ* since it solves F(θ*) = 0 for θ*.

B. Let sign(a) be defined by

sign(a) := 1 if a > 0, −1 if a < 0    (2.11)

If we replace (2.4) by

z sign(F(b) − F(a)) F(z + θ*) ≥ λ z²    (2.12)


Figure 2.3: Any function F(θ), Lipschitz in θ, that is transverse to the θ-axis at θ* and whose values lie in the shaded regions of either of these graphs may be inverted with the dynamic system (2.13).

then the dynamic inverter

θ̇ = −sign(F(b) − F(a)) F(θ)    (2.13)

will suffice for inversion of functions of the same form as F(θ) in Figure 2.1 or for functions such as −F(θ). Consider Figure 2.3. Any Lipschitz continuous function F(θ) which is transverse to the θ-axis at θ* and whose values lie in the gray regions of the figure will be dynamically inverted by (2.13), as long as F(θ) is such that a solution of (2.13) exists for all θ(0) ∈ [a, b]. The proof is similar to the proof above, the essential step in the proof coming from the inequality (2.12).

C. Now suppose that we encounter a function

F(θ) := (θ − c)³    (2.14)

which is graphed in Figure 2.4 for c = 1. This function has a well-defined root θ* = c, but there does not exist a λ > 0 such that (2.12) holds, i.e. there is no line of constant slope λ > 0 such that F(θ) fits in either picture of Figure 2.3. However, consider the following observation:

Observation 2.1.3 If G : R → R, w ↦ G[w], is such that

G[w] = 0 ⟺ w = 0    (2.15)


Figure 2.4: The function (θ − 1)³ with root θ* = 1. No line of slope λ may be drawn through θ* as in Figure 2.2.

then

{G[w] = 0 ⟺ w = 0 and F(θ*) = 0} ⟹ G[F(θ*)] = 0    (2.16)

In other words, if θ* is a root of F(θ), then θ* is also a root of G[F(θ)].

Observation 2.1.3 affords us the freedom to generalize our dynamic root solving method to functions such as (2.14). For instance, let G[w] = sign(w)|w|^(1/3). Then G[F(θ)] = θ − c, which satisfies (2.12) for any λ ∈ (0, 1]. So for dynamic inversion of (2.14) we could use

θ̇ = −G[F(θ)]    (2.17)

Note that neither F(θ) nor G[w] need be Lipschitz continuous, but if G[F(θ)] is Lipschitz in θ, then a unique solution θ(t) is guaranteed to exist. We could also have used G[w] := sign(w)|w|^(1/4), shown in Figure 2.5. The composition G[F(θ)] is shown in Figure 2.6 along with an appropriate line of slope λ = 1/2. In fact there are an infinite number of functions G[w] which satisfy

z G[F(z + θ*)] ≥ λ z²    (2.18)

for some λ > 0. We call such a function G[w] a dynamic inverse of F(θ) since, in the context of the dynamic system (2.17), G[w] solves F(θ) = 0 with exponential convergence

Figure 2.5: The function G[w] := sign(w)|w|^(1/4), a dynamic inverse of F(θ) = (θ − 1)³.

of θ(t) to θ*.
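The benefit of composing with a dynamic inverse is easy to see numerically. The sketch below is an illustration, not from the text: it integrates, by forward Euler, both the plain inverter θ̇ = −F(θ) and the composed inverter (2.17) for the cubic (2.14) with c = 1. The plain flow crawls near the root, where F is nearly flat, while the composed flow converges exponentially:

```python
import math

c = 1.0
F = lambda th: (th - c) ** 3                       # (2.14), root theta* = c = 1
G = lambda w: math.copysign(abs(w) ** (1 / 3), w)  # dynamic inverse sign(w)|w|^(1/3)

h = 1e-3
plain = composed = 2.0               # same initial guess theta(0) = 2
for _ in range(10_000):              # integrate both flows to t = 10
    plain -= h * F(plain)            # plain inverter: stalls where F is flat
    composed -= h * G(F(composed))   # composed inverter (2.17): here G[F(theta)] = theta - c

print(abs(plain - c), abs(composed - c))
```

After ten time units the plain inverter is still more than 0.1 from the root, while the composed inverter's error is below 10⁻³.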

D. The bisection method (see, for instance, [GMW81], page 84) could also be used

to solve for the root of F(θ), but the bisection method relies upon the fact that θ* divides any continuous interval containing θ* into two connected sub-intervals, one in which F(θ) is positive, and the other in which F(θ) is negative. The bisection method is defined only for real-valued functions of one variable. On the other hand dynamic inversion, including the criterion (2.18), generalizes easily to maps F : Rⁿ → Rⁿ as well as maps F(θ, t) which depend on time.
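As a preview of the time-dependent case, the same flow tracks a moving root once a gain is added. The sketch below is an invented illustration (the map F, the target xd, and the gain μ are not from the text): it integrates θ̇ = −μ(F(θ) − xd(t)) by forward Euler, an instance of the inverse-kinematics form F(θ) − xd(t) = 0 that reappears as (2.20) in the next subsection:

```python
import math

F = lambda th: th + th ** 3       # strictly increasing, so the time-varying root is unique
xd = lambda t: math.sin(t)        # moving target value for F(theta)
mu, h = 50.0, 1e-3                # inverter gain and Euler step size

theta, t = 0.0, 0.0
for _ in range(5_000):            # integrate theta' = -mu (F(theta) - xd(t)) to t = 5
    theta -= h * mu * (F(theta) - xd(t))
    t += h

residual = abs(F(theta) - xd(t))  # how nearly F(theta(t)) = xd(t) holds
print(residual)
```

Raising μ shrinks the steady tracking residual, which in this sketch scales roughly like |ẋd|/μ.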

2.1.2 Previous Work
Continuous-time dynamic methods of solving inverse problems have been around

a long time. Indeed, if x*(t) is an isolated asymptotically stable equilibrium solution⁴ of ẋ = Φ(x, t), then ẋ = Φ(x, t) can be regarded as a dynamic inverter for solving

Φ(x, t) = 0    (2.19)

In the areas of adaptive control [SB89] and optimal control [AM90, BH69] dynamical systems have been used to solve for unknown parameters of physical systems. Most
⁴ By an equilibrium solution of ẋ = Φ(x, t) we mean a solution x*(t) that satisfies Φ(x*(t), t) = 0.


Figure 2.6: The composition G[F(θ)] = sign((θ − 1)³)|(θ − 1)³|^(1/4). Now we can draw a line of slope λ = 1/2 (dashed) through (θ*, 0) = (1, 0), like the line in Figure 2.2. The dotted curve is F(θ) = (θ − 1)³.

results are for linear systems whose parameters are assumed to be slowly varying. Such results may often be used also for nonlinear systems that can be converted to linear systems through state-dependent coordinate transformations [SB89]. With the recent vogue of designing dynamic systems and circuits thought to mimic certain models of computation in the nervous system [Mea89, CR93], there has been a renewed interest in viewing dynamics as computation. Gradient flows, in particular, have been heavily relied upon in the neural network literature [JLS88]. More recently Brockett [Bro91, Bro89] has shown how continuous-time dynamical systems may be used to sort lists and solve linear programming problems. Bloch [Blo85, Blo90] has shown how Hamiltonian systems may be used to solve principal component and linear programming problems. Helmke and Moore in [HM94] review a broad variety of inverse and optimization problems solvable by continuous-time dynamical systems.

The inverse-kinematics problem, in which one wishes to solve for a θ(t) satisfying

F(θ) − xd(t) = 0    (2.20)

has given rise to a number of continuous-time dynamic methods [WE84, TD93, NTV91a, NTV91b, NTV94, Tor90b, Tor90a] of solving inverse problems of the form (2.20). The


notion of a dynamic inverse of a nonlinear map, introduced here, generalizes the role of DF(θ)⁻¹w and DF(θ)ᵀw in those methods. In fact we will see that we may use dynamic inversion itself to determine a dynamic inverse, while simultaneously using that dynamic inverse to solve for a time-varying root of interest. Also, we have developed dynamic inversion around the inversion of maps of the form F(θ, t), which is considerably more general than F(θ) − xd(t). Differentiability of F(θ, t) is not required, though it is useful when available. In Chapter 5 we will apply our methods to the inverse-kinematics problem, in particular to the problem of controlling a robotic arm to track an inverse kinematic solution. In the present chapter, however, we will give some example applications (see for instance Examples 2.4.7 and 2.4.8) of dynamic inversion which are not of a form amenable to prior techniques of using continuous-time dynamics for their solutions.
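For differentiable F, the two classical choices alluded to here can themselves be run as flows. The sketch below is an invented 2 × 2 linear example (the matrix A and root b are hypothetical data, not from the text): it integrates the continuous Newton flow θ̇ = −DF(θ)⁻¹F(θ) and the transpose (gradient-type) flow θ̇ = −DF(θ)ᵀF(θ), both of which drive F(θ) to zero. A dynamic inverse G may coincide with either of these choices, or be something far cheaper:

```python
import math

# Hypothetical linear example F(theta) = A (theta - b), so DF = A everywhere.
A    = [[2.0, 1.0], [0.0, 1.0]]
Ainv = [[0.5, -0.5], [0.0, 1.0]]   # A^{-1}, precomputed by hand (det A = 2)
At   = [[2.0, 0.0], [1.0, 1.0]]    # A^T
b    = [1.0, -2.0]                 # the root of F

def mv(M, v):                      # 2x2 matrix-vector product
    return [M[0][0] * v[0] + M[0][1] * v[1],
            M[1][0] * v[0] + M[1][1] * v[1]]

def F(th):
    return mv(A, [th[0] - b[0], th[1] - b[1]])

h = 1e-3
newton = [3.0, 3.0]                # theta' = -A^{-1} F(theta): Newton flow
grad   = [3.0, 3.0]                # theta' = -A^T  F(theta): transpose flow
for _ in range(10_000):            # integrate both flows to t = 10
    dn, dg = mv(Ainv, F(newton)), mv(At, F(grad))
    newton = [newton[0] - h * dn[0], newton[1] - h * dn[1]]
    grad   = [grad[0] - h * dg[0], grad[1] - h * dg[1]]

err = lambda v: math.hypot(v[0] - b[0], v[1] - b[1])
print(err(newton), err(grad))      # both flows converge to the root b
```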

2.1.3 Main Results

The main results of this chapter are as follows:

i. We define a dynamic inverse for nonlinear maps.

ii. Using a dynamic inverse we construct a dynamic system that yields an estimate of the root θ*(t) of F(θ, t) = 0, and we prove that the estimation error is bounded as t → ∞.

iii. We construct a derivative estimator for the root θ*(t), and incorporate that estimator into a dynamic system which estimates θ*(t) with vanishing error as t → ∞.

iv. We construct a dynamic system that dynamically solves for a dynamic inverse as the solution to an inverse problem, while simultaneously using that dynamic inverse to produce an estimator for a root θ*(t), where the estimation error is vanishing.

2.1.4 Chapter Overview

In Section 2.2 we will introduce the formal definition of a dynamic inverse of a

map. Then in Section 2.3 we will use the dynamic inverse to derive a continuous-time dynamic estimator for time-varying vector-valued roots of nonlinear time-dependent maps. We will prove two theorems which assert that the resulting estimation error may be made arbitrarily small within an arbitrarily short period of time by the adjustment of a single scalar gain. With one theorem we will assert that the estimation error becomes arbitrarily small in nite time; with the other theorem we will assert exponential convergence of an

16

Dynamic Inversion of Nonlinear Maps

Chap. 2

itself may be determined dynamically, that is we will pose both the dynamic inverse itself and the root we seek as the solution to an equation of the form F (, t) = 0. In Section 2.5 we will discuss ways in which dynamic inversion may be generalized to cover a broader set of problems. A number of examples will illustrate application of dynamic inversion in cases where closed-form solutions are readily available, allowing the reader to verify the theory

estimator to the root as t . We will then show in Section 2.4 how a dynamic inverse

and operation of dynamic inversion. In Example 2.4.7, however, we apply dynamic inversion to solve for the intersection of two time-varying polynomials, a problem whose quasiperiodic solution is not so readily available in a closed form. Example 2.4.8 shows, in a more abstract context, how dynamic inversion may be used to construct a dynamic controller for a nonlinear control system.

2.2 A Dynamic Inverse

We begin by defining the dynamic inverse, a definition central to the development of the methodology presented in this chapter. The dynamic inverse is defined in terms of the unknown root of a map. Later we will show that a dynamic inverse may be obtained without first knowing the root.

Definition 2.2.1 For F : Rⁿ × R₊ → Rⁿ; (θ, t) ↦ F(θ, t), let θ*(t) be a continuous isolated solution of F(θ, t) = 0. A map G : Rⁿ × Rⁿ × R₊ → Rⁿ; (w, θ, t) ↦ G[w, θ, t] is called a dynamic inverse of F on the ball Br := {z ∈ Rⁿ | ‖z‖₂ ≤ r}, r > 0, if

i. G[0, z + θ*(t), t] = 0 for all t ≥ 0 and z ∈ Br,

ii. the map G[F(θ, t), θ, t] is Lipschitz in θ, piecewise-continuous in t, and

iii. there is a real constant μ, with 0 < μ < ∞, such that

Dynamic Inverse Criterion

zᵀ G[F(z + θ*(t), t), z + θ*(t), t] ≥ μ ‖z‖₂²    (2.21)

for all z ∈ Br.


In order to emphasize the association of G with a particular solution θ*(t) of F(θ, t) = 0, we will sometimes say that G is a dynamic inverse of F(θ, t) with respect to the solution θ*(t). In order to emphasize the association of a particular parameter μ with G, we will sometimes say that G is a dynamic inverse with parameter μ. We will also, at times, restrict the domain of t to some subset of R₊. Note that the definition of a dynamic inverse does not involve dynamics, though its significance will be in the dynamic context of a dynamic inverter. Some easily verified properties of the dynamic inverse that will prove useful are the following:

Property 2.2.2 Positive Scalar Times Dynamic Inverse. If G[w, θ, t] is a dynamic inverse of F(θ, t) with parameter μ, then for any real α > 0, αG[w, θ, t] is a dynamic inverse of F(θ, t) with parameter αμ.

Property 2.2.3 Many μ's for Each Dynamic Inverse. If G[w, θ, t] is a dynamic inverse of F(θ, t) with parameter μ₁, then for any μ₂ such that 0 < μ₂ ≤ μ₁, G[w, θ, t] is a dynamic inverse of F(θ, t) with parameter μ₂.

Property 2.2.4 Stacking Decoupled Dynamic Inverses. Assume that G₁(w₁, θ₁, t) is a dynamic inverse of F₁(θ₁, t) with parameter μ₁, and G₂(w₂, θ₂, t) is a dynamic inverse of F₂(θ₂, t) with parameter μ₂. Let w = (w₁, w₂) and θ = (θ₁, θ₂). Let G and F be defined by

G[w, θ, t] := (G₁(w₁, θ₁, t), G₂(w₂, θ₂, t)),  F(θ, t) := (F₁(θ₁, t), F₂(θ₂, t))    (2.22)

Then G is a dynamic inverse of F with parameter μ = min{μ₁, μ₂}.

Property 2.2.5 Dynamic Inverse at Zero. Let F̃(z, t) := F(z + θ*(t), t) and G̃[w, z, t] := G[w, z + θ*(t), t]. Then G[w, θ, t] is a dynamic inverse of F(θ, t) relative to a solution θ*(t) if and only if G̃[w, z, t] is a dynamic inverse of F̃(z, t) relative to z = 0.

Property 2.2.6 Trivial Dynamic Inverse. If G₁(w, θ, t) is a dynamic inverse of F₁(θ, t), then G₂(w) = w is a dynamic inverse of G₁(F₁(θ, t), θ, t). When G[w] = kw, where k ∈ R, 0 < k < ∞, then G[w] is called a trivial dynamic inverse.


It will be proven in the next section that if F(θ, t) has a dynamic inverse G[w, θ, t], then for all initial conditions θ(0) in an open neighborhood of θ*(0), the integral curves of the vector field −γG[F(θ, t), θ, t] converge exponentially to a neighborhood of θ*(t) as t → ∞. For the case of a scalar-valued F(θ, t) we have the following lemma:

Lemma 2.2.7 Dynamic Inverse for Scalar Functions. Let F : R × R₊ → R; (θ, t) ↦ F(θ, t) be C¹ in θ and continuous in t for all θ in an interval [a, b]. Let θ*(t) be a continuous isolated solution of F(θ, t) = 0. Assume that there exists an r > 0 and a μ > 0 such that

(θ − θ*) sign(F(b, t) − F(a, t)) F(θ, t) ≥ μ (θ − θ*)²    (2.23)

for all (θ − θ*) ∈ Br and all t ∈ R₊. Then

Dynamic Inverse for Scalar Functions

G[w] := sign(D₁F(θ*(0), 0)) w    (2.24)

is a constant dynamic inverse of F(θ, t).

Proof of Lemma 2.2.7: Since F(θ, t) is C¹ in θ, D₁F(θ, t) is well-defined and continuous in θ and t. Since F(θ, t) is continuous and satisfies (2.23) for all t ∈ R₊, sign(F(b, t) − F(a, t)) is well-defined and constant for all t, and, furthermore,

sign(D₁F(θ*(t), t)) = sign(F(b, t) − F(a, t))    (2.25)

Therefore

sign(D₁F(θ*(t), t)) = sign(D₁F(θ*(0), 0))    (2.26)

Thus the sign of D₁F(θ*(t), t) is an invariant of the isolated solution θ*(t). Now from (2.23) and (2.25) we have

(θ − θ*) sign(D₁F(θ*(0), 0)) F(θ, t) ≥ μ (θ − θ*)²    (2.27)

so G[w] given by (2.24) is a constant dynamic inverse for F(θ, t).


Remark 2.2.8 Lemma 2.2.7 tells us that for time-varying scalar-valued C¹ functions, we need only pick a sign to produce a dynamic inverse. Typically one knows an interval [a, b] that brackets the solution. Then one need only evaluate F(a, t₁) and F(b, t₂) for any times t₁ ≥ 0 and t₂ ≥ 0 to determine a dynamic inverse.

Dynamic inverses for affine maps are easily obtained, as illustrated by the following example.

Example 2.2.9 Dynamic Inverse for Affine Maps. Let

F(θ, t) = A(θ − u(t))    (2.28)

where A ∈ Rⁿˣⁿ. Then for any matrix B ∈ Rⁿˣⁿ such that BA is positive definite, G[w, θ, t] = Bw is a dynamic inverse of F. The solution of F(θ, t) = 0 is θ*(t) = u(t). It is clear that

zᵀ G[F(z + u(t), θ, t)] = zᵀ B(Az) ≥ σmin(BA) ‖z‖₂²    (2.29)

where σmin(BA) is the smallest singular value of BA. Note that if A is singular, then F given by (2.28) has no dynamic inverse. If A is non-singular, a possible choice of B is Aᵀ.

We will have occasion in Section 2.4 below to choose a dynamic inverse for one inverse problem to depend on the solution to a different, but related, inverse problem. In such cases the combination of the two inverse problems may be viewed as a single inverse problem through the following property of dynamic inverses.

Property 2.2.10 Stacking Coupled Dynamic Inverses. Assume that G₁(w₁; θ₁, θ₂; t) is a dynamic inverse of F₁(θ₁, θ₂, t) with respect to θ₁*(t) for all θ₂ such that (θ₂ − θ₂*(t)) ∈ Br₂, and G₂(w₂; θ₁, θ₂; t) is a dynamic inverse of F₂(θ₁, θ₂, t) with respect to θ₂*(t) for all θ₁ such that (θ₁ − θ₁*(t)) ∈ Br₁. Let θ := (θ₁, θ₂) and w = (w₁, w₂). Then

G[w, θ, t] := (G₁(w₁; θ₁, θ₂; t), G₂(w₂; θ₁, θ₂; t))    (2.30)

is a dynamic inverse of

F(θ, t) := (F₁(θ₁, θ₂, t), F₂(θ₁, θ₂, t))    (2.31)

with respect to (θ₁*(t), θ₂*(t)) for all (θ₁, θ₂) such that (θ₁ − θ₁*, θ₂ − θ₂*) ∈ Br₁ × Br₂.


Sufficient conditions on F(θ, t) under which a dynamic inverse exists are mild. They are given in the following existence lemma.

Lemma 2.2.11 Sufficient Conditions for Existence of a Dynamic Inverse. For F : Rⁿ × R₊ → Rⁿ; (θ, t) ↦ F(θ, t), let θ*(t) be a continuous isolated solution of F(θ, t) = 0. Let F(θ, t) be C² in θ and continuous in t. Assume that the following are true:

i. D₁F(θ*(t), t) is nonsingular for all t;

ii. D₁F(θ*(t), t) and D₁F(θ*(t), t)⁻¹ are bounded uniformly in t;

iii. for all z ∈ Br, D₁²F(z + θ*(t), t) is bounded uniformly in t.

Under these conditions there exists an r > 0 independent of t, and a function G : Rⁿ × Rⁿ × R₊ → Rⁿ; (w, θ, t) ↦ G[w, θ, t], such that for each t > 0 and for all θ satisfying θ − θ*(t) ∈ Br, G[w, θ, t] is a dynamic inverse of F(θ, t).

Proof of Lemma 2.2.11: Let

F̃(z, t) := F(z + θ*, t)    (2.32)

Since D₁F(θ*, t) is invertible for all t ∈ R₊, by the inverse function theorem (see [AMR88], Theorem 2.5.7, page 121), for each t ∈ R₊ there exists an open neighborhood Nt ⊂ Rⁿ of the origin, and a function F̃⁻¹ : Rⁿ × R₊ → Rⁿ; (w, t) ↦ F̃⁻¹[w, t], such that for all z ∈ Nt,

F̃⁻¹[F̃(z, t), t] = z    (2.33)

Let

G[w, θ, t] := F̃⁻¹[w, t]    (2.34)

If there exists an r > 0 such that

Br ⊂ Nt for all t ∈ R₊    (2.35)

then for all z ∈ Br

zᵀ G[F(z + θ*(t), t), θ, t] = zᵀ z = ‖z‖₂²    (2.36)

and we may choose G as a dynamic inverse with μ satisfying 0 < μ ≤ 1. In the absence of items ii and iii of the hypothesis, there is the possibility that no such r exists; e.g., the


largest ball contained in Nt may be B₀ in the limit as t → ∞. Assurance that an r > 0 exists is provided by a proposition of Abraham, et al. [AMR88] (Proposition 2.5.6, page 119) regarding the size of the ball on which F̃(z, t) = 0 is solvable. Though that proposition gives explicit bounds on r based on the explicit uniform bounds on D₁F(θ*(t), t), D₁F(θ*(t), t)⁻¹, and D₁²F(z + θ*(t), t), for our purposes it is enough to know that the existence of such uniform bounds is sufficient to guarantee the existence of an r > 0.

Though Lemma 2.2.11 requires F(θ, t) to be C² in θ at θ = θ*(t), this is only a sufficient condition for the existence of a dynamic inverse. That it is not necessary is indicated by the next example.

Example 2.2.12 Consider the piecewise-linear time-varying function

F(θ, t) = (θ − u(t)) if θ − u(t) ≥ 0, and F(θ, t) = (1/2)(θ − u(t)) if θ − u(t) < 0    (2.37)

where u(t) is a continuous function of t. The solution to F(θ, t) = 0 is θ*(t) = u(t). Let⁵ G[w, θ, t] = G[w] = w. Then

zᵀ G[F̃(z, t)] = z F(z + u(t)) = z² for z ≥ 0, and (1/2)z² for z < 0, hence ≥ (1/2)‖z‖₂²    (2.38)

where F̃ is as defined in (2.32), so that 0 < μ ≤ 1/2. But F(θ, t) is not differentiable at θ = θ*(t).

Using the exact inverse of F as a dynamic inverse, as in the proof of Lemma 2.2.11, is not very practical, since the exact inverse, though always a dynamic inverse, is normally not known. There is reason for hope, however, in the observation that the criterion that G be a dynamic inverse of F is considerably weaker than the criterion that G be an inverse of F in the usual sense. One might guess that a truncated Taylor expansion for F⁻¹ would be a good candidate for G. That this guess is true is verified in the proof of the following theorem.

⁵Throughout we will use the abuse of notation demonstrated by referring to G[w, θ, t] as G[w] when the value of G depends only on w.

Theorem 2.2.13 Fixed Jacobian Inverse as a Dynamic Inverse. Let θ*(t) be a continuous isolated solution of F(θ, t) = 0, where F(θ, t) is C² in θ and C¹ in t. Let F̃(z, t) := F(θ*(t) + z, t). Assume that D₁F̃(0, t) is nonsingular, and that D₁²F̃ and D₁F̃(0, t)⁻¹ are bounded. Let t₁ > 0 be a constant. Then there exist t₀ and t₂, with 0 ≤ t₀ < t₁ < t₂, and an r(t₁) ∈ R, r(t₁) > 0, such that for any y ∈ Br(t₁),

Fixed Jacobian Dynamic Inverse

G[w] = D₁F̃(y, t₁)⁻¹ w    (2.39)

is a dynamic inverse of F̃(z, t) for all z ∈ Br(t₁) and all t ∈ (t₀, t₂).

Remark 2.2.14 Theorem 2.2.13 tells us that over a sufficiently small time interval, there is an open set of constant matrices such that if M is in that set, then G[w] := Mw is a dynamic inverse of F(θ, t). See Figure 2.7.
Figure 2.7: For any y ∈ Br(t₁), the constant matrix D₁F̃(y, t₁)⁻¹ provides a dynamic inverse for F(θ, t) over a sufficiently small interval (t₀, t₂) containing t₁. See Theorem 2.2.13.

Remark 2.2.15 Nearby Jacobian Inverse as a Dynamic Inverse. We may replace t₁ by t in (2.39) to conclude from Theorem 2.2.13 that D₁F̃(θ(t), t)⁻¹ w is a dynamic inverse of F(θ, t) for all t ≥ 0, if θ(t) is sufficiently close to θ*(t) for all t ≥ 0. This will prove particularly important later when we use the dynamic inverse in a dynamic context in order to keep θ(t) close to θ*(t).

Proof of Theorem 2.2.13: Note that since F̃(0, t) = 0 for all t, if F̃(z, t) is C^k in z, then D₂^l F̃(0, t) ≡ 0 for l ≤ k (see Appendix A for explanations of notation). Using this, we expand F̃(z, t) in a Taylor series in both variables to get

F̃(z, t) = D₁F̃(0, t₁) z + O(‖z‖², |t − t₁| ‖z‖)    (2.40)

For r > 0, let y ∈ Br ⊂ Rⁿ and expand D₁F̃(0, t₁) about y as

D₁F̃(0, t₁) = D₁F̃(y, t₁) + O(‖y‖)    (2.41)

Substitute (2.41) into (2.40) to get

F̃(z, t) = D₁F̃(y, t₁) z + f(z, t)    (2.42)

where

‖f(z, t)‖ = O(‖z‖², |t − t₁| ‖z‖, ‖y‖ ‖z‖)    (2.43)

Now consider the dynamic inverse candidate

G[w] = D₁F̃(y, t₁)⁻¹ w    (2.44)

Left multiply G[F̃(z, t)] by zᵀ and expand F̃ according to (2.42) to get

zᵀ G[F̃(z, t)] = zᵀ D₁F̃(y, t₁)⁻¹ F̃(z, t) = zᵀ z + zᵀ D₁F̃(y, t₁)⁻¹ f(z, t)    (2.45)

Choose μ ∈ (0, 1). If there exists an r ∈ R₊ and an interval (t₀, t₂) containing t₁ such that

zᵀ D₁F̃(y, t₁)⁻¹ f(z, t) ≥ (μ − 1) ‖z‖₂²    (2.46)

for all z ∈ Br and all t ∈ (t₀, t₂), then

zᵀ G[F̃(z, t)] ≥ μ ‖z‖₂²    (2.47)

on Br for t ∈ (t₀, t₂), implying that G is a dynamic inverse of F̃. Since f(z, t) satisfies (2.43),

‖D₁F̃(y, t₁)⁻¹ f(z, t)‖ = O(‖z‖², |t − t₁| ‖z‖, ‖y‖ ‖z‖)    (2.48)

Thus for each t₁ > 0 there is always a sufficiently small r(t₁) > 0, and a sufficiently small interval (t₀, t₂), such that (2.46) is true for the chosen μ.


Remark 2.2.16 Positive-Definite Combinations with the Jacobian Inverse. Let B(θ, t) ∈ Rⁿˣⁿ be a matrix-valued function, with B continuous in t, and let F(θ, t) be C¹ in θ. Then any B(θ, t) such that

B(θ*, t) D₁F(θ*, t) > 0    (2.49)

gives a dynamic inverse G[w, θ, t] = B(θ, t) w of F(θ, t) for all θ sufficiently close to θ*. This includes as special cases B(θ, t) = D₁F(θ, t)⁻¹ and B(θ, t) = D₁F(θ, t)ᵀ, where ‖θ − θ*‖ is sufficiently small.

Though it will often be convenient to choose a linear dynamic inverse, a dynamic inverse need not be linear, as shown by the following two examples.

Example 2.2.17 Nonlinear Dynamic Inverse. Let F(θ, t) = (θ − sin(t))³, so that θ* = sin(t). Note that F(θ, t) fails to satisfy the conditions of Lemma 2.2.11. Let G[w] := sign(w)|w|^{1/3}. Then

zᵀ G[F̃(z, t)] = zᵀ z = ‖z‖₂²    (2.50)

so G[w] is a dynamic inverse of F(θ, t). Note that, though G[w] itself is not Lipschitz in w, G[F(θ, t)] = θ − sin(t), which is Lipschitz in θ and continuous in t; thus G[w] satisfies items ii and iii of the dynamic inverse definition, Definition 2.2.1.

Later, in Section 2.4, we will show how a dynamic inverse can be determined dynamically; that is, we will find both the root and the dynamic inverse itself using a single dynamic system.
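The nonlinear dynamic inverse of Example 2.2.17 is easy to test numerically: since sign(w)|w|^{1/3} applied to (θ − sin t)³ recovers θ − sin t exactly, the flow θ̇ = −γ G[F(θ, t)] (the bounded-error inverter of the next section, with E ≡ 0) reduces to a linear error equation. The sketch below is a minimal fixed-step forward-Euler check; the gain and step size are our own choices.

```python
import math

# Dynamic inverter theta' = -gamma * G[F(theta,t)] for F(theta,t) = (theta - sin t)^3
# with the nonlinear dynamic inverse G[w] = sign(w)|w|^{1/3} of Example 2.2.17.
gamma, dt = 50.0, 1e-4
theta, t = 0.5, 0.0                     # theta(0) != theta*(0) = sin(0) = 0
while t < 2.0:
    Fval = (theta - math.sin(t)) ** 3
    Gval = math.copysign(abs(Fval) ** (1.0 / 3.0), Fval)   # equals theta - sin(t)
    theta += dt * (-gamma * Gval)       # bounded-error inverter, E = 0
    t += dt
err = abs(theta - math.sin(t))          # should be of order |cos t| / gamma
```

With no derivative estimator the error does not vanish, but it is confined to a ball of radius on the order of 1/γ, as Theorem 2.3.1 below guarantees.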

2.3 Dynamic Inversion

In this section we will use the dynamic inverse to construct a dynamic system whose state is an estimator for the root θ*(t) of F(θ, t) = 0. We present two theorems, Theorem 2.3.1 and Theorem 2.3.5, collectively called the dynamic inversion theorem, covering both the case where we have no estimate of θ̇*(t) as well as the case in which we do have such an estimate.

2.3.1 Dynamic Inversion with Bounded Error

Suppose we wish to produce an estimate for the root θ*(t) of a map F : Rⁿ × R₊ → Rⁿ. Assume that we have a rough estimate of θ*(0), called θ₀, with θ₀ − θ*(0) ∈ Br, but no

estimator for θ̇*. Theorem 2.3.1 below tells us that once we have found a dynamic inverse G[w, θ, t], we are guaranteed that there always exists a real γ > 0 such that the solution θ(t) to

θ̇ = −γ G[F(θ, t), θ, t],  θ(0) = θ₀    (2.51)

approximates θ*(t) arbitrarily closely in an arbitrarily short period of time. The following theorem is quite general in that it allows us to find roots of continuous, but not necessarily differentiable, nonlinear maps.

Theorem 2.3.1 Dynamic Inversion Theorem: Bounded Error. For F : Rⁿ × R₊ → Rⁿ; (θ, t) ↦ F(θ, t), let θ*(t) be a continuous isolated solution of F(θ, t) = 0 for all t ∈ R₊. Let G : Rⁿ × Rⁿ × R₊ → Rⁿ; (w, θ, t) ↦ G[w, θ, t] be a dynamic inverse of F(θ, t) with parameter μ for all θ such that θ − θ*(t) is in Br for all t ∈ R₊. Let E : Rⁿ × R₊ → Rⁿ; (θ, t) ↦ E(θ, t) be Lipschitz in θ and piecewise-continuous in t. Assume that there exists an ε ∈ R₊ such that

‖θ̇*(t) − E(z + θ*(t), t)‖₂ ≤ ε/2    (2.52)

for all t ∈ R₊ and z ∈ Br. Assume also that θ(0) − θ*(0) is in Br. Then for each γ > 0, the solution θ(t) to

Dynamic Inverter with Bounded Error

θ̇ = −γ G[F(θ, t), θ, t] + E(θ, t)    (2.53)

satisfies

‖z(t)‖₂ ≤ ‖z(0)‖₂ e^{−γμt/2} for 0 ≤ t ≤ t₁, and ‖z(t)‖₂ ≤ ε/(γμ) for t > t₁    (2.54)

where z := θ − θ* and

t₁ = (2/(γμ)) ln(γμ ‖z(0)‖₂ / ε)    (2.55)

Proof of Theorem 2.3.1: Let z(t) := θ(t) − θ*(t), and assume ε > 0. Transforming (2.53) to z-coordinates and letting F̃(z, t) := F(z + θ*(t), t) and G̃[w, z, t] := G[w, z + θ*(t), t] gives

ż = −γ G̃[F̃(z, t), z, t] + E(z + θ*(t), t) − θ̇*(t)    (2.56)

Since G[F(θ, t), θ, t] and E(θ, t) are Lipschitz in θ and piecewise-continuous in t, a solution z(t), t ∈ R₊, exists for (2.56). Let

V(z) = (1/2) ‖z‖₂²    (2.57)

Then

V̇(z) = zᵀ ż = −γ zᵀ G̃[F̃(z, t), z, t] + zᵀ (E(z + θ*(t), t) − θ̇*(t))    (2.58)

By (2.52),

zᵀ (E(z + θ*(t), t) − θ̇*(t)) ≤ (ε/2) ‖z(t)‖₂    (2.59)

Combining (2.58) and (2.59) along with the assumption that G̃[w, z, t] is a dynamic inverse of F̃(z, t) with parameter μ, we have

V̇(z) ≤ −γμ ‖z(t)‖₂² + (ε/2) ‖z(t)‖₂    (2.60)

Therefore, for z(t) satisfying ‖z(t)‖₂ ≥ ε/(γμ),

V̇(z) ≤ −(γμ/2) ‖z(t)‖₂² = −γμ V(z)    (2.61)

Let y(t) satisfy y(0) = V(z(0)) and ẏ = −γμ y. Then

y(t) = y(0) e^{−γμt} = (1/2) ‖z(0)‖₂² e^{−γμt}    (2.62)

By Theorem B.1.1 (see Appendix B), V(z(t)) ≤ y(t). As a consequence,

‖z(t)‖₂ ≤ ‖z(0)‖₂ e^{−γμt/2}    (2.63)

for all t such that ‖z(t)‖₂ ≥ ε/(γμ). Since V̇ ≤ 0 on the boundary of B_{ε/(γμ)}, if z(0) ∈ B_{ε/(γμ)}, then z(t) can never leave B_{ε/(γμ)}. If z(0) ∉ B_{ε/(γμ)}, then z(t) is guaranteed to enter B_{ε/(γμ)} no later than t₁, where t₁ is the solution to

ε/(γμ) = ‖z(0)‖₂ e^{−γμ t₁/2}    (2.64)

namely (2.55).
. It may also The map E (, t) in Theorem 2.3.1 may model an estimator for model errors resulting from the representation of F (, t) or the presence of noise. Remark 2.3.2 Note that dierentiability of F (, t) is not a requirement for application of Theorem 2.3.1.


Example 2.3.3 Dynamic Inversion of a Piecewise-Linear Map with No Derivative Estimate. Consider the map F : [−4, 4] × R → R defined by

F(θ, t) = sin(4πt) + { −4 − θ/2 for θ < −2;  3θ/2 for −2 ≤ θ ≤ 2;  4 − θ/2 for θ > 2 }    (2.65)

as shown by the solid line in Figure 2.8. Clearly F(θ, t) is not differentiable with respect to θ for all θ ∈ [−4, 4].

Figure 2.8: The function F(θ, t) (2.65) for t = 0 (solid), t = 1/8 (dotted), and t = 3/8 (dashed).

The unique solution of F(θ, t) = 0 in [−4, 4] is θ*(t) = −(2/3) sin(4πt). A dynamic inverse of F(θ, t) is G[w, θ, t] = w, corresponding to μ = 1. A dynamic inverter for F is then

θ̇ = −γ F(θ, t)    (2.66)

for any real constant γ > 0, where F(θ, t) is defined by (2.65). For this example we take E(θ, t) ≡ 0, though in a later example, Example 2.3.7, we will construct and use a non-zero E(θ, t).

Figure 2.9: The upper graph shows the solutions of the dynamic inverter (2.66) for γ = 10 (dashed) and γ = 100 (solid). The initial condition was θ(0) = 3. The lower graph shows the estimation error for the dynamic inverter (2.66) using γ = 10 (dashed) and γ = 100 (solid).

The top graph of Figure 2.9 shows the simulated solutions of (2.66) for θ(0) = 3, with γ = 10 and γ = 100. The simulations were done in Matlab [Mat92] using an adaptive step-size fourth- and fifth-order Runge-Kutta integrator, ode45. Each solution can be seen to converge to a neighborhood of θ*(t); the higher the value of γ, the smaller the neighborhood and the faster the convergence. The estimation error for each of the simulations is shown in the bottom graph of Figure 2.9.

As an analog computational paradigm, it is natural to consider the realization of a dynamic inverter in an analog circuit.

Example 2.3.4 A Dynamic Inverter Circuit. Consider a nonlinear circuit element, such as a diode, represented schematically in Figure 2.10.

Figure 2.10: Nonlinear circuit element of Example 2.3.4.

Assume that the circuit element is characterized by

i = f(Va − Vb)    (2.67)

where i is the current through the circuit element, and Va and Vb are the voltages at each end of the circuit element as indicated in Figure 2.10. Assume that the characteristic of the circuit element is continuous, strictly monotonic, and lies in the shaded region of the graph of Figure 2.11.

Figure 2.11: The characteristic Va − Vb versus f(Va − Vb) is strictly monotonic, continuous, and lies in the shaded region. A typical curve is shown. See Example 2.3.4.

Figure 2.12: Circuit realization of a dynamic inverter. See Example 2.3.4.

Now consider the circuit of Figure 2.12, composed of linear resistors, ideal operational amplifiers⁶, and the nonlinear circuit element with characteristic (2.67). The circuit of Figure 2.12 is composed of a number of standard operational-amplifier sub-circuits: an integrator, a current-to-voltage converter, an inverting amplifier, and a differential amplifier⁷. To solve for Vout in terms of Vin note the following:

Vout = −(1/(R₆C)) ∫₀ᵗ V₃ dt + Vout(0)    (2.68)

where Vout(0) is due to any charge on the capacitor at t = 0,

V₃ = (R₃/R₂)(V₂ − V₁)    (2.69)

V₁ = −R₁ i    (2.70)

V₂ = −(R₅/R₄) Vin    (2.71)

i = f(Vout)    (2.72)

Substitute Equations (2.70), (2.71), and (2.72) into Equation (2.69), let

R₁ = R₅/R₄    (2.73)

and then substitute the resulting expression for V₃ into (2.68) to get

Vout = −(R₃R₅/(R₂R₄R₆C)) ∫₀ᵗ (f(Vout) − Vin) dt + Vout(0)    (2.74)

Let

γ = R₃R₅/(R₂R₄R₆C)    (2.75)

Now differentiate Equation (2.74) to get

V̇out = −γ (f(Vout) − Vin)    (2.76)

The differential equation (2.76) is a dynamic inverter which solves for θ satisfying f(θ) − Vin = 0; thus the circuit of Figure 2.12 is a realization of a dynamic inverter for the nonlinear circuit element characterized by (2.67). For sufficiently high γ, and V̇in(·) sufficiently small, after a transient the relationship between Vin and Vout is approximately characterized by the inverse of the characteristic of the nonlinear circuit element, as indicated in Figure 2.13. The larger the value of γ and the smaller the bound on V̇in(·), the better the relation between Vin and Vout approximates the inverse of the nonlinear characteristic.

⁶For an ideal operational amplifier, the open-loop gain of the amplifier is infinite, and no current flows into the + or − terminals. See [CDK87], page 175.
⁷For a review of the characteristics of such circuits see Chua, Desoer, and Kuh [CDK87], Chapter 4.

Figure 2.13: Effective characteristic (solid) of the dynamic inverter circuit of Figure 2.12. The nonlinear element's characteristic is indicated in gray. See Example 2.3.4.

Of course, practical realizations of such circuits as that of Figure 2.12 normally require modification in order to compensate for temperature fluctuations and non-ideal properties of the operational amplifiers.
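The behavior of the circuit equation (2.76) can be sketched numerically. Here f is a hypothetical strictly monotonic element characteristic, our stand-in for the curve of Figure 2.11, and the lumped gain plays the role of γ in (2.75); the input, gain, and step size are our own choices.

```python
import math

def f(v):
    # Hypothetical strictly monotonic element characteristic (an assumption,
    # standing in for the diode-like curve of Figure 2.11).
    return v ** 3 + v

gamma, dt = 200.0, 1e-4        # gamma = R3*R5/(R2*R4*R6*C) as in (2.75)
v_out, t = 0.0, 0.0
while t < 1.0:
    v_in = math.sin(t)                            # slowly varying input
    v_out += dt * (-gamma * (f(v_out) - v_in))    # circuit dynamics (2.76)
    t += dt
residual = abs(f(v_out) - math.sin(t))            # f(Vout) should track Vin
```

After the transient, f(Vout) tracks Vin to within a residual of order |V̇in|/γ, so Vout approximates f⁻¹(Vin), as claimed.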


2.3.2 Dynamic Inversion with Vanishing Error

It is often the case that a differentiable representation of F(θ, t) is available. Under this condition an estimator, E(θ, t), for θ̇* may be obtained. Differentiate F(θ*(t), t) = 0 with respect to t to get the identity

D₁F(θ*(t), t) θ̇*(t) + D₂F(θ*, t) = 0    (2.77)

Solve for θ̇*, and replace θ* by θ, to get the derivative estimator

E(θ, t) := −D₁F(θ, t)⁻¹ D₂F(θ, t)    (2.78)

If F(θ, t) is C² in θ, then E(θ, t) becomes an arbitrarily precise estimator of θ̇*(t) as θ approaches θ*(t). Other approximators E(θ, t) satisfying E(θ, t) ≈ θ̇* are also possible, as we will see in Section 2.4. Derivative estimators of this sort may be incorporated into dynamic inversion in order to produce an estimator θ(t) for θ*(t) that is not only attracted to a neighborhood of θ*(t) in finite time, as in the case of Theorem 2.3.1, but is attracted to θ*(t) itself as t → ∞. We will formalize this fact in the following theorem.

Theorem 2.3.5 Dynamic Inversion Theorem: Vanishing Error. Let θ*(t) be a continuous isolated solution of F(θ, t) = 0, with F : Rⁿ × R₊ → Rⁿ; (θ, t) ↦ F(θ, t). Assume that G : Rⁿ × Rⁿ × R₊ → Rⁿ; (w, θ, t) ↦ G[w, θ, t] is a dynamic inverse of F(θ, t) on Br, for some finite μ > 0. Let E : Rⁿ × R₊ → Rⁿ; (θ, t) ↦ E(θ, t) be locally Lipschitz in θ and continuous in t. Assume that for some constant β ∈ (0, ∞), E(θ, t) satisfies

‖θ̇*(t) − E(z + θ*(t), t)‖₂ ≤ β ‖z‖₂    (2.79)

for all z ∈ Br. Let θ(t) denote the solution to the system

Dynamic Inverter with Vanishing Error

θ̇ = −γ G[F(θ, t), θ, t] + E(θ, t)    (2.80)


with initial condition θ(0) satisfying θ(0) − θ*(0) ∈ Br. Then

‖θ(t) − θ*(t)‖₂ ≤ ‖θ(0) − θ*(0)‖₂ e^{−(γμ−β)t}    (2.81)

for all t ∈ R₊, and in particular if γ > β/μ, then θ(t) converges to θ*(t) exponentially as t → ∞.

Proof of Theorem 2.3.5: Let z(t) := θ(t) − θ*(t), F̃(z, t) := F(z + θ*(t), t), and G̃[w, z, t] := G[w, z + θ*(t), t]. Differentiate z(t) = θ(t) − θ*(t) with respect to t, and substitute (2.80) for θ̇, to get

ż = −γ G̃[F̃(z, t), z, t] + E(z + θ*(t), t) − θ̇*(t)    (2.82)

Let

V(z) := (1/2) ‖z‖₂²    (2.83)

Differentiate V with respect to t to get

(d/dt) V(z) = −γ zᵀ G̃[F̃(z, t), z, t] + zᵀ (E(z + θ*(t), t) − θ̇*(t))    (2.84)

Then by Definition 2.2.1 and (2.79),

(d/dt) V(z) ≤ −γμ ‖z‖₂² + β ‖z‖₂² = −(γμ − β) ‖z‖₂²    (2.85)

so that for z ∈ Br,

(d/dt) V(z) ≤ −2(γμ − β) V(z)    (2.86)

Then by Theorem B.1.1 of Appendix B,

V(z) ≤ V(0) e^{−2(γμ−β)t}    (2.87)

Substitute the right-hand side of (2.83) for V(z) to conclude

‖z(t)‖₂ ≤ ‖z(0)‖₂ e^{−(γμ−β)t}    (2.88)

for all t ≥ 0.

Remark 2.3.6 Note that, as in Theorem 2.3.1, differentiability of F(θ, t) is not required for application of Theorem 2.3.5, though when F(θ, t) is differentiable we may construct E(θ, t) as in (2.78).

Example 2.3.7 Dynamic Inversion of a Piecewise-Linear Map with a Derivative Estimate. Let F and G be as in Example 2.3.3. Let E(θ, t) := −(8π/3) cos(4πt), which is the time-derivative of the solution θ*(t) = −(2/3) sin(4πt) of F(θ, t) = 0. Use the dynamic inverter

θ̇ = −γ F(θ, t) + E(θ, t)    (2.89)

with the same initial condition as before, θ(0) = 3. The top graph of Figure 2.14 shows the simulation results, and the bottom graph of Figure 2.14 shows the estimation error. In this case the errors can be seen to go to zero exponentially. Note that for both dynamic inverters (2.66) and (2.89), each of the solutions θ(t) passes through the point θ = 2, a local maximum of F, at which F(θ(t), t) is not differentiable. In contrast, Newton's method and the gradient method are undefined for non-differentiable functions, and even if we were to make F(θ, t) differentiable in θ by smoothing it, Newton's method would fail due to the local maximum at θ = 2.
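Example 2.3.7 is equally easy to reproduce. The sketch below is a minimal fixed-step forward-Euler run (step size and horizon are our own choices; the map is again read with sin(4πt)); with the exact derivative term supplied, the error should vanish rather than merely stay bounded.

```python
import math

def F(theta, t):
    # Piecewise-linear map (2.65).
    if theta < -2.0:
        core = -4.0 - theta / 2.0
    elif theta > 2.0:
        core = 4.0 - theta / 2.0
    else:
        core = 1.5 * theta
    return math.sin(4.0 * math.pi * t) + core

gamma, dt = 10.0, 1e-4
theta, t = 3.0, 0.0
while t < 1.0:
    E = -(8.0 * math.pi / 3.0) * math.cos(4.0 * math.pi * t)  # exact d(theta*)/dt
    theta += dt * (-gamma * F(theta, t) + E)                   # inverter (2.89)
    t += dt
err = abs(theta + (2.0 / 3.0) * math.sin(4.0 * math.pi * t))
```

Up to the integrator's own discretization error, the final error is essentially zero even at the modest gain γ = 10, in contrast to the bounded error of Example 2.3.3.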


Figure 2.14: The top graph shows solutions of the dynamic inverter (2.89) with E(θ, t) = θ̇*(t) for γ = 10 (dashed) and γ = 100 (solid), along with the actual solution θ*(t) (dotted). The initial condition was θ(0) = 3. The bottom graph shows the corresponding estimation error.


Though we have assumed knowledge of θ̇*(t) in Example 2.3.7, this is most often not a practical assumption. In the next example we show how the derivative estimator (2.78) may be used in the context of Theorem 2.3.5.

Example 2.3.8 Dynamic Inversion with a Derivative Estimate. Let

F(θ, t) = (2 + sin(t)) tan(θ) − cos(t)    (2.90)

Then D₁F(θ, t) = (2 + sin(t)) sec²(θ). Using the derivative estimator (2.78) gives

E(θ, t) = −(sin(t) + cos(t) tan(θ)) cos²(θ) / (2 + sin(t))    (2.91)

Let G[w] = w, which corresponds to the dynamic inverse (2.24). Then a dynamic inverter for θ*(t) is

θ̇ = −γ G[F(θ, t)] + E(θ, t) = −γ ((2 + sin(t)) tan(θ) − cos(t)) − (sin(t) + cos(t) tan(θ)) cos²(θ) / (2 + sin(t))    (2.92)

Figure 2.15 shows the results of a simulation of the dynamic inverter (2.92) for θ(0) = 1. The top graph of Figure 2.15 shows θ(t) and θ*(t) (dotted). The bottom graph shows the absolute value of the error between θ(t) and the solution

θ*(t) = arctan( cos(t) / (2 + sin(t)) )    (2.93)

Note that θ(0) ≠ θ*(0).

Remark 2.3.9 Dynamic Inversion with Perfect Initial Conditions. If θ(0) = θ*(0), then the conditions of Theorem 2.3.5 guarantee that θ(t) ≡ θ*(t) for all t ∈ R₊. So in a sense, we need only solve the inverse problem at a single instant, t = 0. Then the dynamic inversion takes care of maintaining the inversion for all t.

Remark 2.3.10 Maintenance of a State-Dependent Jacobian Inverse as a Dynamic Inverse. Let

G[w, θ, t] := D₁F(θ, t)⁻¹ w    (2.94)

where θ is the state of a dynamic inverter. It follows from Lemma 2.2.11 and Theorem 2.3.5 that if γ is sufficiently large, θ(0) − θ*(0) is sufficiently small, and G[w, θ(0), 0] is a dynamic inverse of F(θ, t) at t = 0, then G[w, θ(t), t] is a dynamic inverse of F(θ, t) for all t > 0 (see also Remark 2.2.15).
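The inverter (2.92) of Example 2.3.8 can be sketched in a few lines; the gain γ = 10 and the step size below are our own choices, not values fixed by the text.

```python
import math

# Forward-Euler simulation of the inverter (2.92) for
# F(theta,t) = (2 + sin t) tan(theta) - cos t, with derivative estimator (2.78).
gamma, dt = 10.0, 1e-4
theta, t = 1.0, 0.0          # theta(0) = 1 != theta*(0) = arctan(1/2)
while t < 2.0:
    F = (2.0 + math.sin(t)) * math.tan(theta) - math.cos(t)
    E = -(math.sin(t) + math.cos(t) * math.tan(theta)) * math.cos(theta) ** 2 \
        / (2.0 + math.sin(t))
    theta += dt * (-gamma * F + E)
    t += dt
theta_star = math.atan(math.cos(t) / (2.0 + math.sin(t)))   # closed form (2.93)
err = abs(theta - theta_star)
```

Despite the imperfect initial condition, θ(t) converges to the closed-form solution (2.93), as Theorem 2.3.5 predicts.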

Figure 2.15: The top graph shows the state trajectory θ(t) (solid) of the dynamic inverter (2.92), along with the solution θ*(t) (dotted). The bottom graph shows the error norm |θ(t) − θ*(t)|.

Example 2.3.11 will illustrate application of Remark 2.3.10 to the estimation of θ*(t).

Example 2.3.11 Dynamic Inversion Using a State-Dependent Jacobian Inverse. Let w and θ be in Rⁿ. Assume that the assumptions of Lemma 2.2.11 hold. We may obtain an estimator E(θ, t) for θ̇* from (2.78). Assume that r has been chosen sufficiently small, and that D₂F(θ, t) is sufficiently bounded, so that E(θ, t) satisfies (2.79) for all z ∈ Br. Let

G[w, θ, t] := D₁F(θ, t)⁻¹ w    (2.95)

and assume that r is small enough that G is a dynamic inverse of F on Br. If (θ(0) − θ*(0)) ∈ Br and γ is sufficiently large, then by Theorem 2.3.5 the approximation error z(t) := θ(t) − θ*(t) using (2.80) will converge exponentially to zero.
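Example 2.3.11 amounts to a continuous-time Newton-like flow, θ̇ = −D₁F(θ)⁻¹(γ F(θ, t) + D₂F(θ, t)). The sketch below runs it on a two-dimensional map of our own choosing (an illustrative assumption, not taken from the text), using a linear solve in place of an explicit Jacobian inverse.

```python
import numpy as np

def F(th, t):
    # Illustrative two-dimensional map (our choice, not from the text).
    return np.array([th[0] ** 2 + th[1] - np.sin(t),
                     th[0] - th[1] - np.cos(t)])

def D1F(th):
    return np.array([[2.0 * th[0], 1.0],
                     [1.0, -1.0]])

def D2F(t):
    return np.array([-np.cos(t), np.sin(t)])

gamma, dt = 20.0, 1e-4
th = np.array([(-1.0 + np.sqrt(5.0)) / 2.0,      # exact root of F(., 0) = 0
               (-3.0 + np.sqrt(5.0)) / 2.0])
t = 0.0
while t < 1.0:
    # (2.80) with the state-dependent Jacobian dynamic inverse (2.95):
    # theta' = -D1F(theta)^{-1} (gamma * F(theta,t) + D2F(theta,t))
    th = th + dt * (-np.linalg.solve(D1F(th), gamma * F(th, t) + D2F(t)))
    t += dt
residual = float(np.linalg.norm(F(th, t)))
```

Starting on the root (compare Remark 2.3.9), the state tracks the moving root for all t, with only the integrator's discretization error remaining.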

2.4 Dynamic Estimation of a Dynamic Inverse

In this section we will show how we can apply the dynamic inversion theorem to the construction of a dynamic system whose state includes both a dynamic inverse of a particular F as well as an approximation for the root of F. Consideration of the example of dynamic inversion of a time-varying matrix [GM95b] will lead the way⁸.

Example 2.4.1 Inversion of Time-Varying Matrices. Consider the problem of estimating the inverse Φ*(t) ∈ Rⁿˣⁿ of a time-varying matrix A(t) ∈ GL(n, R), where GL(n, R) denotes the group of invertible matrices in Rⁿˣⁿ. Assume that we have representations for both A(t) and Ȧ(t), and that A(t) is C¹ in t. Let Φ be an element of Rⁿˣⁿ. In order for Φ to be the inverse of A(t), Φ must satisfy

Φ A(t) − I = 0    (2.96)

Let F : Rⁿˣⁿ × R₊ → Rⁿˣⁿ; (Φ, t) ↦ F(Φ, t) be defined by

F(Φ, t) := Φ A(t) − I    (2.97)

As usual we will refer to the solution of F(Φ, t) = 0 as Φ*(t). To obtain an estimator E(Φ, t) for Φ̇*(t), differentiate Φ*A = I with respect to t, solve the resulting expression for Φ̇*, replace A⁻¹ by Φ, and then replace Φ* by Φ in the resulting expression to get

E(Φ, t) := −Φ Ȧ(t) Φ    (2.98)

Differentiate F(Φ, t) with respect to Φ to get the derivative map

D₁F(Φ, t) : X ↦ X A(t)    (2.99)

whose inverse is X ↦ X A(t)⁻¹. So a choice of dynamic inverse is

G[w, Φ, t] := w Φ    (2.100)

for Φ sufficiently close to Φ* = A⁻¹(t) and with w ∈ Rⁿˣⁿ. The dynamic inverter for this problem then takes the form

Φ̇ = −γ G[F(Φ, t), Φ] + E(Φ, t)    (2.101)

or, expanded,

Φ̇ = −γ (Φ A(t) − I) Φ − Φ Ȧ(t) Φ    (2.102)

and we choose as initial conditions Φ(0) ≈ Φ*(0) = A⁻¹(0) so that the estimation error starts small. Theorem 2.3.5 guarantees that for sufficiently large γ, and for Φ(0) sufficiently close to A⁻¹(0), equation (2.102) will produce an estimator Φ(t) whose error Φ(t) − Φ*(t) decays exponentially to zero at a rate determined by our choice of γ. Even if we don't know Ȧ(t), we can, by Theorem 2.3.1, take E(Φ, t) to be identically zero and achieve inversion with a bounded error.

⁸In Chapter 3 we will cover dynamic matrix inversion in more depth.

Remark 2.4.2 Inversion of Time-Varying Matrices. Example 2.4.1 allows one to invert time-varying matrices without calling upon discrete matrix-inversion routines. One need only calculate or approximate a single inverse, A(0)⁻¹. The flow of (2.102) then takes care of the inversion for all t > 0. Remark 2.4.2 will be expanded in Section 3.2 of Chapter 3.

Remark 2.4.3 Notation. In the following example and in the remainder of this section we will couple two dynamic inverters together: one which estimates the solution θ*(t) of F(θ, t) = 0, and the other which solves for a matrix Π to be used in a dynamic inverse G[w, Π] of F(θ, t). In order to distinguish between the map, dynamic inverse, and derivative estimator for each of the two problems, we adopt the convention of referring to the map, dynamic inverse, and derivative estimator for Π by F_Π, G_Π, and E_Π respectively, retaining the designations F, G, and E for the map, dynamic inverse, and derivative estimator for θ.

Example 2.4.4 Obtaining a Dynamic Inverse Dynamically. Assume that F(θ, t) satisfies the assumptions of Lemma 2.2.11 with continuous isolated solution θ*(t). Assume that D₁F(θ, t) is C² in θ and C¹ in t. Let Π ∈ Rⁿˣⁿ denote an estimator for D₁F(θ*, t)⁻¹. We may then estimate

40

Dynamic Inversion of Nonlinear Maps

Chap. 2

We may then estimate $\dot\theta^*(t)$ as follows. Differentiate $F(\theta^*,t)=0$, solve for $\dot\theta^*$, and substitute, in the resulting expression, $\eta$ for $\theta^*$ and $\Phi$ for $D_1F(\theta^*(t),t)^{-1}$ to obtain an estimator for $\dot\theta^*$ in terms of $\Phi$, $\eta$, and $t$:
\[ E(\Phi,\eta,t) := -\Phi\,D_2F(\eta,t) \tag{2.103} \]
Assume that $E(\Phi,\eta,t)$ is Lipschitz in $\Phi$ and $\eta$, and piecewise-continuous in $t$. Writing $E(\Phi,\eta,t) = [E_i(\Phi,\eta,t)]_{i\le n}$, and similarly to (2.98), we estimate $\dot\Phi^*$ with
\[ \bar E(\Phi,\eta,t) := -\Phi\left[\frac{d}{dt}D_1F(\eta,t)\Big|_{\dot\eta=E(\Phi,\eta,t)}\right]\Phi \tag{2.104} \]
where
\[ \frac{d}{dt}D_1F(\eta,t)\Big|_{\dot\eta=E(\Phi,\eta,t)} := \sum_{i=1}^{n}\frac{\partial D_1F(\eta,t)}{\partial\eta_i}\,E_i(\Phi,\eta,t) + \frac{\partial D_1F(\eta,t)}{\partial t} \tag{2.105} \]
In this case,
\[ \bar F(\Phi,\eta,t) := D_1F(\eta,t)\,\Phi - I \tag{2.106} \]
Let
\[ \bar G[W,\Phi] := \Phi W \tag{2.107} \]
as in Example 2.4.1 (see (2.100)), with $W\in\mathbb{R}^{n\times n}$. Theorem 2.3.5 now tells us that we may estimate $\theta^*(t)$ with the system of coupled nonlinear differential equations
\[ \begin{bmatrix}\dot\Phi\\ \dot\eta\end{bmatrix} = -\mu\begin{bmatrix}\Phi & 0\\ 0 & \Phi\end{bmatrix}\begin{bmatrix}\bar F(\Phi,\eta,t)\\ F(\eta,t)\end{bmatrix} + \begin{bmatrix}\bar E(\Phi,\eta,t)\\ E(\Phi,\eta,t)\end{bmatrix} \tag{2.108} \]

with guaranteed exponential convergence of $(\Phi,\eta)$ to $(\Phi^*,\theta^*)$. After a definition, we summarize the result of Example 2.4.4 with a theorem.

Definition 2.4.5 For $(\Phi,\eta)\in\mathbb{R}^{n\times n}\times\mathbb{R}^n$, define the norm $\|(\Phi,\eta)\|_2$ by
\[ \|(\Phi,\eta)\|_2 := \left(\sum_{i,j=1}^{n}|\Phi_{ij}|^2 + \sum_{i=1}^{n}|\eta_i|^2\right)^{1/2} \tag{2.109} \]


Norm (2.109) is thus the $l_2$ norm of the matrix $[\Phi,\eta]$ where we consider $\eta$ to be a column vector.

Theorem 2.4.6 Dynamic Inversion with Dynamic Determination of a Dynamic Inverse. Let $F(\theta,t)$ satisfy the assumptions of Lemma 2.2.11. Then for $\Phi(0)$ sufficiently close to $D_1F(\theta^*,0)^{-1}$, and $\eta(0)$ sufficiently close to $\theta^*(0)$, the solution $(\Phi(t),\eta(t))$ of
\[ \begin{bmatrix}\dot\Phi\\ \dot\eta\end{bmatrix} = -\mu\begin{bmatrix}\Phi & 0\\ 0 & \Phi\end{bmatrix}\begin{bmatrix}D_1F(\eta,t)\,\Phi - I\\ F(\eta,t)\end{bmatrix} + \begin{bmatrix}-\Phi\left[\frac{d}{dt}D_1F(\eta,t)\big|_{\dot\eta=-\Phi D_2F(\eta,t)}\right]\Phi\\[2pt] -\Phi\,D_2F(\eta,t)\end{bmatrix} \tag{2.110} \]
satisfies $(\Phi(t),\eta(t)) \to (D_1F(\theta^*,t)^{-1},\,\theta^*(t))$ as $t\to\infty$. Furthermore, for sufficiently large $\mu>0$, the convergence is exponential, i.e. there exist $k_1>0$ and $k_2>0$ such that
\[ \|(\Phi(t),\eta(t)) - (\Phi^*(t),\theta^*(t))\|_2 \le k_1\,\|(\Phi(0),\eta(0)) - (\Phi^*(0),\theta^*(0))\|_2\,e^{-k_2 t} \tag{2.111} \]
for all $t\ge 0$, where $\Phi^*(t) = D_1F(\theta^*(t),t)^{-1}$.

Proof of Theorem 2.4.6: Let
\[ \hat F((\Phi,\eta),t) := \begin{bmatrix}\bar F(\Phi,\eta,t)\\ F(\eta,t)\end{bmatrix} \quad\text{and}\quad \hat G[(W,w),\Phi] := \begin{bmatrix}\Phi & 0\\ 0 & \Phi\end{bmatrix}\begin{bmatrix}W\\ w\end{bmatrix} \tag{2.112} \]
Note that
\[ D_1\hat F((\Phi,\eta),t) = \begin{bmatrix}D_1F(\eta,t) & \star\\ 0 & D_1F(\eta,t)\end{bmatrix} \tag{2.113} \]
where $\star$ indicates an unspecified $n\times n$ block matrix. If $\Phi$ is sufficiently close to $D_1F(\theta^*(t),t)^{-1}$, then the product of $\mathrm{diag}(\Phi,\Phi)$ and $D_1\hat F((\Phi,\eta),t)$ is positive definite. Thus $\hat G[(W,w),\Phi]$ is a dynamic inverse of $\hat F((\Phi,\eta),t)$ for $(\Phi,\eta)$ sufficiently close to $(\Phi^*,\theta^*)$. It follows that $\hat G^*[(W,w)] = D_1\hat F((\Phi^*,\theta^*),t)^{-1}[W^T, w^T]^T$, and $\hat G[(W,w),\Phi]$ is continuous in its arguments. Hence, for $\Phi(0)$ sufficiently close to $D_1F(\theta^*,0)^{-1}$, and $\eta(0)$ sufficiently close to $\theta^*(0)$, $\hat G[(W,w),\Phi]$ is a dynamic inverse of $\hat F((\Phi,\eta),t)$. Also, for
\[ \hat E((\Phi,\eta),t) := \begin{bmatrix}\bar E(\Phi,\eta,t)\\ E(\Phi,\eta,t)\end{bmatrix} \tag{2.114} \]


we have that $\hat E((\Phi^*,\theta^*),t) = (\dot\Phi^*,\dot\theta^*)$ and $\hat E((\Phi,\eta),t)$ is continuous in $(\Phi,\eta)$. Therefore, by Theorem 2.3.5, for sufficiently large $\mu>0$, equation (2.110) is a dynamic inverter for $(\Phi^*,\theta^*)$, and $(\Phi,\eta)$ converges exponentially to $(\Phi^*,\theta^*)$.

In all of the preceding examples in which dynamic inversion was applied to approximate a root $\theta^*(t)$, a closed-form expression for $\theta^*(t)$ has been available by inspection of $F(\theta,t)$. This has facilitated verification that dynamic inverters do what they are supposed to do. In the following example of dynamic inversion $\theta^*(t)$ is not so easily determined analytically.

Example 2.4.7 Tracking Intersections of Curves. Consider the two time-dependent cubic curves in the $x,y$ plane,
\[ y = (2+\sin(t))x^3 - \left(1+\tfrac13\sin(\sqrt2\,t)\right)x, \qquad y = -(2+\sin(3t))x^3 + \left(1+\tfrac14\sin^2(5t)\right)x \tag{2.115} \]
For each $t\ge 0$ it is readily verified that these curves intersect at three points: one point is the origin, one is to the right of $(0,0)$ at $\theta^*(t) = (x^*(t), y^*(t))$, and one is to the left of the origin at $(-x^*(t), -y^*(t))$. Figure 2.16 shows the two curves and their intersections for six values of $t$ and for $x\ge 0$. Let
\[ F(\theta,t) = F((x,y),t) = \begin{bmatrix} y - (2+\sin(t))x^3 + \left(1+\tfrac13\sin(\sqrt2\,t)\right)x \\ y + (2+\sin(3t))x^3 - \left(1+\tfrac14\sin^2(5t)\right)x \end{bmatrix} \tag{2.116} \]

We will be interested in the solution $\theta^*(t) = (x^*(t), y^*(t))$ of $F(\theta,t)=0$ to the right of $x=0$; the other solutions are $(0,0)$ and $(-x^*(t), -y^*(t)) = -\theta^*(t)$. We will use Theorem 2.4.6 to track the solution $\theta^*(t)$. In this case, $\Phi\in\mathbb{R}^{2\times 2}$,
\[ D_1F((x,y),t) = \begin{bmatrix} -3(2+\sin(t))x^2 + \left(1+\tfrac13\sin(\sqrt2\,t)\right) & 1 \\ 3(2+\sin(3t))x^2 - \left(1+\tfrac14\sin^2(5t)\right) & 1 \end{bmatrix} \tag{2.117} \]
\[ D_2F((x,y),t) = \begin{bmatrix} -\cos(t)x^3 + \tfrac{\sqrt2}{3}\cos(\sqrt2\,t)\,x \\ 3\cos(3t)x^3 - \tfrac52\sin(5t)\cos(5t)\,x \end{bmatrix} \tag{2.118} \]
\[ E(\Phi,(x,y),t) = -\Phi\,D_2F((x,y),t) =: \begin{bmatrix}E_1\\ E_2\end{bmatrix} \tag{2.119} \]

Figure 2.16: The solution of interest in Example 2.4.7, $\theta^*(t) = (x^*(t), y^*(t))$, is the intersection (to the right of $(0,0)$) of the two cubic curves shown in each of the graphs. This figure shows the pair of cubic curves (2.115) for $t\in\{0,1,\ldots,5\}$.

and
\[ \frac{d}{dt}D_1F((x,y),t) = \begin{bmatrix} -3\cos(t)x^2 - 6(2+\sin(t))x\,E_1 + \tfrac{\sqrt2}{3}\cos(\sqrt2\,t) & 0 \\ 9\cos(3t)x^2 + 6(2+\sin(3t))x\,E_1 - \tfrac52\sin(5t)\cos(5t) & 0 \end{bmatrix} \tag{2.120} \]

From (2.116), (2.117), (2.118), and (2.120) we can construct the dynamic inverter (2.110). When $t=0$, the root to the right of $(0,0)$ can be obtained by inspection as $(x^*(0), y^*(0)) = (1/\sqrt2, 0)$. Thus we could use $(x(0),y(0)) = (1/\sqrt2, 0)$ and $\Phi(0) = D_1F((x(0),y(0)),0)^{-1}$ as initial conditions for the dynamic inverter to produce the exact⁹ solution $(x^*(t), y^*(t))$ for all $t\ge 0$. In order to demonstrate an error transient, however, we choose initial conditions
\[ \begin{bmatrix}x(0)\\ y(0)\end{bmatrix} = \begin{bmatrix}1\\ 0\end{bmatrix}, \qquad \Phi(0) = \begin{bmatrix}-1/4 & 1/4\\ 1/2 & 1/2\end{bmatrix} \tag{2.121} \]

⁹By exact for the simulation, we mean exact up to the tolerance of the integrator which, in this example, was $10^{-6}$.


Figure 2.17 shows the results of a simulation of the dynamic inverter using the adaptive step-size Runge-Kutta integrator ode45 from Matlab [Mat92], with $\mu=10$. The upper graph shows $x(t)$ (solid) and $y(t)$ (dashed) versus $t$. The lower graph shows $(x(t), y(t))$ for $t\in[0,10]$. The root $\theta^*(t) = (x^*(t), y^*(t))$ traces out a quasi-periodic curve. Note that if we were to change $\sqrt2$ to $2$, for instance, in (2.115) and (2.116), the solution would have a period of $2\pi$. Figure 2.18 shows the log of the approximation error as seen through $F$, namely $\log_{10}\|F((x(t),y(t)),t)\|$. The error can be seen to decay to the level of the integrator tolerance, $10^{-6}$, within 2 seconds.

Figure 2.17: The solution of the dynamic inverter of Example 2.4.7 for $F(\theta,t)=0$, where $\theta=(x,y)$. The upper graph shows $x(t)$ versus $t$ (solid) and $y(t)$ versus $t$ (dashed). The lower graph shows $x(t)$ versus $y(t)$ with the initial condition $(x(0),y(0))=(1,0)$ marked by the small circle.


Figure 2.18: The estimation error for the dynamic inverter of Example 2.4.7 as seen through $F$ (2.116): $\log_{10}\|F(\eta(t),t)\|$ versus $t$ in seconds. See Example 2.4.7.


An example of application of Theorem 2.4.6 to the inversion of robot kinematics will be given in Chapter 5. In the closing example of this chapter we apply Theorem 2.4.6 to the solution of a standard problem in the control of nonlinear systems.

Example 2.4.8 Dynamic Inversion of a Nonlinear Control System. Consider the multi-input, multi-output, time-varying nonlinear control system
\[ \dot x = f(x,t,u) \tag{2.122} \]
with $x$ and $u$ in $\mathbb{R}^n$. Assume that $f(x,t,u)$ is $C^2$ in its arguments. Let $x(t)$ denote the solution of (2.122). Assume also that $D_3f(x,t,u)$ is invertible for all $(x,u)$ in a neighborhood of $(0,0)\in\mathbb{R}^n\times\mathbb{R}^n$ and for all $t\ge 0$. Consider also the vector field $\phi(x,t)$, assumed to be $C^2$ in $x$ and $t$. Suppose we wish to solve for $u$ such that
\[ \phi(x,t) = f(x,t,u) \tag{2.123} \]
i.e. we wish to solve for a $u(\cdot)$ that will cause the state $x(t)$ to obey the dynamics
\[ \dot x = \phi(x,t) \tag{2.124} \]
Define
\[ F(u,t) := f(x(t),t,u) - \phi(x(t),t) \tag{2.125} \]
Let $u^*(t)$ be a continuous isolated solution, assumed to exist, of $F(u,t)=0$. Let
\[ \bar F(\Phi,u,t) := D_3f(x(t),t,u)\,\Phi - I \tag{2.126} \]
so that $\Phi^*(t) = D_3f(x(t),t,u^*)^{-1}$. As in Theorem 2.4.6 let
\[ G[w,\Phi] = \Phi w \tag{2.127} \]
with $w\in\mathbb{R}^n$, and
\[ \bar G[W,\Phi] = \Phi W \tag{2.128} \]
with $W\in\mathbb{R}^{n\times n}$. To solve for an estimator $E(\Phi,u,t)$ for $\dot u^*$, we differentiate $F(u^*,t)=0$ with respect to $t$, solve for $\dot u^*$, and replace $D_1F(u,t)^{-1}$ by $\Phi$ and $\dot x$ by $f(x,t,u)$ to get
\[ E(\Phi,u,t) = -\Phi\big( (D_1f(x(t),t,u) - D_1\phi(x(t),t))\,f(x,t,u) + D_2f(x,t,u) - D_2\phi(x,t) \big) \tag{2.129} \]


Similarly, to solve for an estimator $\bar E(\Phi,u,t)$ for $\dot\Phi^*$, differentiate $\bar F(\Phi^*,u,t)=0$ with respect to $t$, solve for $\dot\Phi^*$, and replace $D_3f(x,t,u)^{-1}$ by $\Phi$, $\dot x$ by $f(x,t,u)$, and $\dot u$ by $E(\Phi,u,t)$ to get
\[ \bar E(\Phi,u,t) = -\Phi\big( D_{1,3}f(x(t),t,u)\,f(x(t),t,u) + D_{2,3}f(x(t),t,u) + D_{3,3}f(x(t),t,u)\,E(\Phi,u,t) \big)\Phi \tag{2.130} \]
If $u(0)$ and $\Phi(0)$ are sufficiently close to $u^*(0)$ and $D_3f(x(0),0,u^*(0))^{-1}$ respectively, then a dynamic compensator which produces a $u$ that converges exponentially toward $u^*$ is
\[ \dot u = -\mu\,G[F(u,t),\Phi] + E(\Phi,u,t), \qquad \dot\Phi = -\mu\,\bar G[\bar F(\Phi,u,t),\Phi] + \bar E(\Phi,u,t) \tag{2.131} \]
Furthermore, if we choose $u(0)$ to satisfy $F(u(0),0)=0$, and $\Phi(0)$ to satisfy $\Phi(0) = D_3f(x(0),0,u(0))^{-1}$, then $x(t)$ will satisfy (2.124) for all $t\ge 0$. Figure 2.19 shows the closed-loop system including the dynamic inversion compensator and the original nonlinear plant (2.122).

Figure 2.19: The closed-loop system with dynamic inversion compensator (2.131), with state $(\Phi,u)$, and the nonlinear plant (2.122), with state $x$.

Note that if a convenient closed form exists for $(D_3f(x,t,u))^{-1}$, or if one is satisfied to use discrete numerical matrix inversion, one could replace $\Phi$ by $(D_3f(x,t,u))^{-1}$ and eliminate the $\dot\Phi$ equations.
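As a concrete illustration of the compensator (2.131), here is a hedged sketch for a scalar toy plant of my own choosing (not from the thesis): $f(x,t,u) = x + u + u^3/3$ with target field $\phi(x,t) = -x$. Since $D_3f = 1 + u^2 \ge 1$, the scalar $\Phi$ is globally well posed, and with the exact initial conditions noted above, $x(t)$ should ride on the target flow $\dot x = -x$.

```python
import numpy as np
from scipy.integrate import solve_ivp

mu = 10.0

def rhs(t, s):
    x, u, Phi = s
    f = x + u + u**3 / 3                 # plant vector field
    F = f - (-x)                         # F(u,t) = f - phi, with phi(x,t) = -x
    Fbar = (1 + u**2) * Phi - 1.0        # D3f * Phi - 1            (2.126)
    E = -Phi * (1.0 - (-1.0)) * f        # (2.129): D1f = 1, D1phi = -1
    Ebar = -Phi * (2 * u * E) * Phi      # (2.130): only D33f = 2u is nonzero
    return [f,                           # plant x' = f(x, t, u)    (2.122)
            -mu * Phi * F + E,           # u' from (2.131)
            -mu * Phi * Fbar + Ebar]     # Phi' from (2.131)

# exact initial conditions: F(u(0), 0) = 0 and Phi(0) = D3f(x(0), 0, u(0))^{-1}
x0 = 1.0
r = np.roots([1 / 3, 0, 1, 2 * x0])      # real root of u^3/3 + u + 2*x0 = 0
u0 = float(r[np.abs(r.imag) < 1e-9].real[0])
sol = solve_ivp(rhs, (0.0, 3.0), [x0, u0, 1 / (1 + u0**2)],
                rtol=1e-10, atol=1e-12)
x3 = sol.y[0, -1]
```

With $u(0)$ perturbed instead, $u$ would converge exponentially to $u^*$ and the tracking error would decay accordingly, as the theorem guarantees.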

2.5 Generalizations of Dynamic Inversion

The dynamic inversion theorems, Theorems 2.3.1 and 2.3.5, rely upon the use of quadratic Lyapunov functions, and indeed the definition of a dynamic inverse, Definition 2.2.1, is tailored for association with a quadratic Lyapunov function. We may generalize dynamic inversion based on more general Lyapunov functions. For instance, consider the following definition.


Definition 2.5.1 General Dynamic Inverse. For
\[ F: \mathbb{R}^n\times\mathbb{R}_+ \to \mathbb{R}^n;\ (\theta,t)\mapsto F(\theta,t) \]
let $\theta^*(t)$ be a continuous isolated solution of $F(\theta,t)=0$. A map
\[ G: \mathbb{R}^n\times\mathbb{R}^n\times\mathbb{R}_+ \to \mathbb{R}^n;\ (w,\theta,t)\mapsto G[w,\theta,t] \]
is called a dynamic inverse of $F$ on the ball $B_r := \{z\in\mathbb{R}^n \mid \|z\|\le r\}$, $r>0$, if

i. $G[0,\,z+\theta^*(t),\,t] = 0$ for all $z\in B_r$, $t\ge 0$,

ii. the map $G[F(\theta,t),\theta,t]$ is Lipschitz in $\theta$, piecewise-continuous in $t$, and

iii. there exists a continuously differentiable function $V: [0,\infty)\times B_r \to \mathbb{R};\ (t,z)\mapsto V(t,z)$ such that for all $z\in B_r$,
\[ \alpha_1(\|z\|) \le V(t,z) \le \alpha_2(\|z\|) \tag{2.132} \]
\[ D_1V(t,z) + D_2V(t,z)\left( -G[F(z+\theta^*,t),\,z+\theta^*,\,t] - \dot\theta^*(t) \right) \le -\alpha_3(\|z\|) \tag{2.133} \]
where $\alpha_1(\cdot)$, $\alpha_2(\cdot)$, and $\alpha_3(\cdot)$ are of class $K$ (see Appendix A) on $[0,r)$.

A more general dynamic inversion theorem follows from Definition 2.5.1.

Theorem 2.5.2 General Dynamic Inversion Theorem. Let $\theta^*(t)$ be a continuous isolated solution of $F(\theta,t)=0$, with $F:\mathbb{R}^n\times\mathbb{R}_+\to\mathbb{R}^n;\ (\theta,t)\mapsto F(\theta,t)$. Assume that $G:\mathbb{R}^n\times\mathbb{R}^n\times\mathbb{R}_+\to\mathbb{R}^n;\ (w,\theta,t)\mapsto G[w,\theta,t]$ is a dynamic inverse (Definition 2.5.1) of $F(\theta,t)$ on $B_r$. Let $\eta(t)$ denote the solution to the system
\[ \dot\eta = -G[F(\eta,t),\eta,t] \tag{2.134} \]
with initial condition $\eta(0)$ satisfying $\eta(0)-\theta^*(0)\in B_r$. Then $\eta\to\theta^*$ asymptotically.

Proof of Theorem 2.5.2: Since $G[w,\theta,t]$ is assumed to be a dynamic inverse of $F(\theta,t)$, there exists a function $V(t,z)$ satisfying (2.132) and (2.133). It follows (see [Kha92], Theorem 4.1, page 169) that the origin $z=0$ of the system
\[ \dot z = -G[F(z+\theta^*(t),t),\,z+\theta^*(t),\,t] - \dot\theta^*(t) \tag{2.135} \]
is uniformly asymptotically stable. Thus $\eta\to\theta^*$ asymptotically as $t\to\infty$.


Though it is readily apparent that Definition 2.5.1 leads to a more general dynamic inversion theorem, Theorem 2.5.2, with a simple proof, use of the more general definition also imposes the generally difficult requirement of finding a Lyapunov function in order to prove that $G$ is indeed a dynamic inverse of $F$. In contrast, the dynamic inverse criterion of Definition 2.2.1 is often easily verified from familiarity with the inverse problem one is trying to solve. For instance, one often knows that $D_1F(\theta,t)$ is invertible for all $\theta$ sufficiently close to $\theta^*(t)$. In such cases Definition 2.2.1 leads easily to the constructive methods of, for instance, Theorem 2.4.6. What we would gain in generality by relying upon Definition 2.5.1 we would lose in ease of construction of dynamic inverters for a broad and useful set of inverse problems.

Another consideration in our choice of dynamic inverse definition, Definition 2.2.1, is that it leads to exponentially stable systems. Exponentially stable systems are known to maintain their exponential stability under a wide variety of perturbations. This fact has been of profound value in the history of control theory, accounting, for instance, for the wide successes of the application of linear controllers to the control of nonlinear systems. When dynamic inverters are incorporated into control laws, this exponential stability allows one to call upon a variety of well-known results of stability theory in order to conclude exponential stability of the closed-loop control system. We will see an example of this in Chapter 4, where we will apply dynamic inversion to construct a tracking controller that allows tracking of implicitly defined trajectories with exponential convergence. By retaining exponential stability of a closed-loop control system we allow that control system to retain a useful level of robustness with respect to perturbations and modeling errors.

2.6 Chapter Summary

By building upon simple quadratic Lyapunov stability arguments we have developed a methodology for the construction of a class of nonlinear dynamic systems that can solve time-dependent finite-dimensional inverse problems. The notion of a dynamic inverse of a map has been introduced. We have shown a number of ways in which dynamic inverses may be obtained, perhaps the most powerful being through dynamic inversion itself, where the dynamic inverse is solved for at the same time it is being used to track the root of interest. We have shown how derivative estimation can make the difference between an ultimately bounded approximation error and an approximation error that converges exponentially to zero.


For realization of dynamic inversion on a digital computer, an integration method must be chosen. With current digital technology, integration can be slow, particularly for ordinary differential equations of high dimension. Of course this disadvantage might be made less severe by redesigning computers to optimize integration. On the other hand, the reliance of dynamic inversion on integration places all questions of accuracy and rate of convergence squarely in the lap of the chosen integration routine, and the accuracy and convergence properties of discrete integrators are a well-studied problem. In the following chapters we will apply dynamic inversion to a variety of problems in mathematics and nonlinear control.


Chapter 3

Dynamic Methods for Polar Decomposition and Inversion of Matrices


3.1 Introduction
In Chapter 2 we introduced a technique in which a dynamic system was used to generate an approximation $\eta(t)$ to the solution $\theta^*(t)$ of a nonlinear vector equation of the form $F(\theta,t)=0$. As we saw in Example 2.4.1, one may also pose the inverse of a time-varying matrix as a solution to an equation of the form $F(\theta,t)=0$. Square roots and other matrix functions may be posed similarly. Motivated by this realization, in this chapter we will further investigate the use of dynamic inversion to construct dynamic systems that perform matrix inversion as well as polar decomposition.

3.1.1 Previous Work

As in the case of vector equations (see Section 2.1.2), continuous-time dynamic methods of solving matrix equations have appeared before. Any dynamic system on a matrix space for which an asymptotically stable equilibrium exists may be considered to be a dynamic inverter that solves for its equilibrium. Continuous-time dynamic methods for determining eigenvalues date back at least as far as Rutishauser [Rut54, Rut58]. We have already mentioned (see Section 2.1.2) the work of Brockett [Bro91, Bro89], who has shown how one can use matrix differential equations to perform computation often


thought of as being intrinsically discrete, and Bloch [Blo85, Blo90], who has shown how Hamiltonian systems may be used to solve principal component and linear programming problems. Chu [Chu95] has studied the Toda flow as a continuous-time analog of the QR algorithm. Chu [Chu92] and Chu and Driessel [CD91b] have explored the use of differential equations in solving linear algebra problems. Smith [Smi91] and Helmke et al. [HMP94] have constructed dynamical systems that perform singular-value decomposition. Dynamic methods of matrix inversion have also appeared in the artificial neural network literature; see, for instance, Jang et al. [JLS88] and Wang [Wan93]. For a review of dynamic matrix methods as well as a comprehensive list of references for dynamic approaches to optimization see [HM94].

A dynamic decomposition related to polar decomposition of fixed matrices has also appeared in Helmke and Moore [HM94], though, as the authors point out, their gradient-based method does not guarantee the positive definiteness of the symmetric component of the polar decomposition. Using dynamic inversion we will derive a system that produces the desired inverse and polar decomposition products at any fixed time $t_1>0$ with guaranteed positive definiteness of the symmetric component.

As far as we know, all prior continuous-time dynamic approaches to inversion of matrix equations use gradient flows. In contrast, dynamic inversion, as we formulate it, does not require the requisite metric needed to define a gradient. Though we will see in Section 3.3.1 that gradient approaches fit well into the dynamic inversion framework, the main results of this chapter do not rely upon a metric structure.

3.1.2 Main Results

In this chapter, using dynamic inversion, we will join constant matrices with unknown inverses to constant matrices with known inverses through a $t$-parameterized path of matrices, where $t$ may be identified with time. As the path proceeds from the matrix with the known inverse to the matrix with the unknown inverse, functions of the state of the dynamic inverter exactly track the polar decomposition factors as well as the inverse of the path element. The path is such that when $t=1$ the unknown matrix is reached; hence the functions of the state of the dynamic inverter at $t=1$ provide the exact desired polar decomposition factors as well as the inverse. By scaling time we may produce the exact desired inverse and the polar decomposition by any prescribed time $t_1>0$.

The main results of this chapter are as follows. We will construct dynamic systems that

i. invert time-dependent matrices asymptotically,

ii. invert a spectrally restricted matrix by a prescribed time,

iii. invert and decompose any $t$-dependent invertible matrix into its polar decomposition factors,

iv. invert and decompose any constant nonsingular matrix into its polar decomposition factors by a prescribed time.

Result ii will be obtained from result i using homotopy. Likewise, result iv will be obtained from result iii using homotopy.

3.1.3 Chapter Overview

In Example 2.4.1 of Chapter 2 we examined the application of dynamic inversion to the problem of inverting time-varying matrices, where we assumed that a sufficiently good approximation existed for the inverse of the time-varying matrix at an initial time. In Section 3.2 we will show some further applications of time-varying matrix inversion. Motivated by the desire to obtain the initial inverse dynamically, in Section 3.3 we will consider the problem of inverting constant matrices. By using a matrix homotopy from the identity we will use the results of Section 3.2 to produce exact inversion of a restricted class of constant matrices, including positive definite matrices, by a prescribed time. In Section 3.4 we will consider the polar decomposition of a time-varying matrix. We will show how, starting from a good guess at the initial value of the inverse of the positive definite part of the polar decomposition, we may construct a dynamic system that produces an exponentially convergent estimate of the inverse of the positive definite symmetric part. From this estimate and the original matrix we may obtain the decomposition products as well as the inverse. Then in Section 3.5 we revisit the problem of constant matrix inversion and show how, by combining homotopy with dynamic polar decomposition, we may dynamically produce the polar decomposition factors as well as the inverse of any constant matrix by a prescribed time, without requiring an initial guess.¹

¹For the notation of this chapter see Section A of the Appendix.


3.2 Inverting Time-Varying Matrices

We summarize the results of Example 2.4.1 of Chapter 2 in the following theorem. Recall that $GL(n,\mathbb{R})$ refers to the general linear group of $n\times n$ invertible matrices with real components (see Appendix A for notation), the group operation being matrix multiplication.

Theorem 3.2.1 Dynamic Inversion of Time-Varying Matrices. Let $A(t)\in GL(n,\mathbb{R})$ be $C^1$ in $t$, with $A(t)$, $A(t)^{-1}$, and $\dot A(t)$ bounded on $[0,\infty)$. Let $G[w,\eta,t]$ be a dynamic inverse (see Definition 2.2.1) of $F(\eta,t) = A(t)\eta - I$ for all $t\in\mathbb{R}_+$, and for all $\eta$ such that $\eta-\theta^*(t)$ is in $B_r$. Let $\eta(t)\in\mathbb{R}^{n\times n}$ be the solution to
\[ \dot\eta = -\mu\,G[A(t)\eta - I,\,\eta,\,t] - \eta\,\dot A(t)\,\eta \tag{3.1} \]
with $\|\eta(0)-\theta^*(0)\| \le r < \infty$. Then for sufficiently small $r$, there exist $\mu^*>0$, $k_1>0$, and $k_2>0$ such that for all $\mu>\mu^*$, and for all $t\ge 0$,
\[ \|\eta(t)-\theta^*(t)\|_2 \le k_1\,\|\eta(0)-\theta^*(0)\|_2\,e^{-k_2 t} \tag{3.2} \]
In particular $\eta(t)\to A(t)^{-1}$ as $t\to\infty$.

Example 3.2.2 A Dynamic Inverter for a Time-Varying Matrix. Let
\[ G[w,t] := A(t)^T w \tag{3.3} \]
Then by Theorem 3.2.1, for a sufficiently large constant $\mu>0$, and for $\eta(0)$ sufficiently close to $A(0)^{-1}$, the solution $\eta(t)$ of the

Dynamic Inverter for a Time-Varying Matrix
\[ \dot\eta = -\mu\,A(t)^T(A(t)\eta - I) - \eta\,\dot A(t)\,\eta \tag{3.4} \]

approaches $A(t)^{-1}$ exponentially as $t\to\infty$. See also Example 2.4.1, where the dynamic inverse $G[w,\eta] = \eta w$ is used instead of $G[w,t] = A(t)^T w$.
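In code, the inverter (3.4) is a few lines. The following minimal sketch (the particular $A(t)$ is my own illustrative choice, not from the thesis) integrates it from the exact initial inverse and tracks $A(t)^{-1}$:

```python
import numpy as np
from scipy.integrate import solve_ivp

mu = 50.0

def A(t):
    # a C^1, uniformly invertible test matrix
    return np.array([[2 + np.sin(t), 0.5],
                     [0.3, 2 + np.cos(t)]])

def Adot(t):
    return np.array([[np.cos(t), 0.0],
                     [0.0, -np.sin(t)]])

def rhs(t, s):
    eta = s.reshape(2, 2)
    At = A(t)
    # eta' = -mu A^T (A eta - I) - eta A' eta        (3.4)
    deta = -mu * At.T @ (At @ eta - np.eye(2)) - eta @ Adot(t) @ eta
    return deta.ravel()

sol = solve_ivp(rhs, (0.0, 5.0), np.linalg.inv(A(0.0)).ravel(),
                rtol=1e-9, atol=1e-12)
etaT = sol.y[:, -1].reshape(2, 2)
```

Starting from the exact inverse, the derivative-estimator term alone propagates $A(t)^{-1}$, and the $\mu$ term only corrects integration drift; from an approximate $\eta(0)$ the error would instead decay exponentially, per (3.2).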

Example 3.2.3 Dynamic Inversion of a Mass Matrix. Consider a finite dimensional mechanical system modeled by the implicit second order differential equation
\[ M(q)\ddot q + N(q,\dot q) = 0 \tag{3.5} \]
Usually the matrix $M(q)$ is positive definite and symmetric for all $q$ since the kinetic energy, $(1/2)\dot q^T M(q)\dot q$, is greater than zero for all $\dot q \ne 0$. It is often convenient to express such systems in an explicit form, with $\ddot q$ alone on the left side of a second order ordinary differential equation. To do so we will invert $M(q)$ dynamically. Let $\eta$ be a symmetric estimator for $M(q)^{-1}$. Suppose we know $M^{-1}(q(0))$ approximately. If our approximation is sufficiently close to the true value of $M^{-1}(q(0))$, then setting $\eta(0)$ to that approximation, and letting $\mu>0$ be sufficiently large, allows us to apply Theorem 3.2.1. Then the system

Dynamic Inverter for a Mass Matrix
\[ \dot\eta = -\mu\,\eta\,(M(q)\eta - I) - \eta\left[\frac{\partial M_{i,j}(q)}{\partial q}\,\dot q\right]_{i,j\le n}\eta, \qquad \ddot q = -\eta\,N(q,\dot q) \tag{3.6} \]

provides an exponentially convergent estimate of $\ddot q$ for all $t$. Furthermore, if $\eta(0) = M(q(0))^{-1}$, then $\eta(t) = M^{-1}(q(t))$ for all $t\ge 0$.

Remark 3.2.4 Symmetry and the Choice of Dynamic Inverse. In Example 3.2.3, $M(q)$ is symmetric, as is its inverse $M(q)^{-1}$. The right hand side of (3.6) is also symmetric; hence if $\eta(0)$ is symmetric, so will be $\eta(t)$ for all $t$. If we had chosen $G[w,q] := M(q)^T w$ as a dynamic inverse (see, for instance, Example 3.2.2) we would not have had this symmetry. The symmetry allows us to cast the top equation of (3.6) on the space $S(n,\mathbb{R})$ of symmetric $n\times n$ matrices, thereby reducing the complexity of the dynamic inverter with respect to the nonsymmetric case; what would otherwise be $n^2$ equations in (3.6) is reduced to $s(n) := n(n+1)/2$ equations.
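A sketch of Example 3.2.3 for a hypothetical 2-DOF system (the $M$ and $N$ below are illustrative choices of mine, not from the thesis): the inverter (3.6) runs alongside the mechanical states, and with $\eta(0) = M(q(0))^{-1}$ the estimate rides on $M^{-1}(q(t))$ exactly.

```python
import numpy as np
from scipy.integrate import solve_ivp

mu = 50.0

def M(q):
    # symmetric, positive definite for all q
    c = np.cos(q[1])
    return np.array([[3 + c, 1 + 0.5 * c],
                     [1 + 0.5 * c, 2.0]])

def Mdot(q, qd):
    s = -np.sin(q[1]) * qd[1]          # d/dt cos(q2) along the trajectory
    return np.array([[s, 0.5 * s],
                     [0.5 * s, 0.0]])

def N(q, qd):
    return np.array([np.sin(q[0]) + 0.1 * qd[0], 0.2 * qd[1]])

def rhs(t, s):
    q, qd = s[:2], s[2:4]
    eta = s[4:].reshape(2, 2)
    qdd = -eta @ N(q, qd)                                       # (3.6), right
    deta = -mu * eta @ (M(q) @ eta - np.eye(2)) - eta @ Mdot(q, qd) @ eta
    return np.concatenate([qd, qdd, deta.ravel()])

q0, qd0 = np.array([0.5, -0.2]), np.array([0.0, 0.3])
eta0 = np.linalg.inv(M(q0))            # exact inverse at t = 0
sol = solve_ivp(rhs, (0.0, 5.0), np.concatenate([q0, qd0, eta0.ravel()]),
                rtol=1e-9, atol=1e-12)
qT = sol.y[:2, -1]
etaT = sol.y[4:, -1].reshape(2, 2)
```

For simplicity this sketch integrates all $n^2$ entries of $\eta$ rather than the $s(n)$ entries that the symmetric reduction of Remark 3.2.4 would allow; symmetry of $\eta(t)$ is preserved either way.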

3.2.1 Left and Right Inversion of Time-Varying Matrices

Consider a matrix $A(t)\in\mathbb{R}^{m\times n}$. Assume that $A(t)$ is of full rank for all $t\ge 0$. We consider two cases. (1) If $m\le n$, then $A(t)$ has a right inverse $\theta^*(t)\in\mathbb{R}^{n\times m}$ satisfying
\[ F(\theta,t) := A(t)\theta - I = 0 \tag{3.7} \]
It is easily verified that
\[ G[w,\eta] := \eta w \tag{3.8} \]
is a dynamic inverse for $F(\eta,t)$ when $\eta$ is sufficiently close to $\theta^* = A(t)^T(A(t)A(t)^T)^{-1}$. Differentiate $F(\theta^*,t)=0$ with respect to $t$, solve for $\dot\theta^*$, and replace $\theta^*$ by $\eta$ to get the derivative estimator
\[ E(\eta,t) := -\eta\,\dot A(t)\,\eta \tag{3.9} \]
Thus a dynamic inverter for right-inversion of a time-varying matrix is
\[ \dot\eta = -\mu\,\eta\,(A(t)\eta - I) - \eta\,\dot A(t)\,\eta \tag{3.10} \]
The form of this dynamic inverter may be seen to be identical to (2.102). Alternatively, we may use Theorem 3.2.1 to invert $A(t)A(t)^T$, constructing the right inverse as $A(t)^T\eta(t)$. (2) In the case that $m\ge n$, $A(t)$ has a left inverse $\theta^*(t)$ which satisfies
\[ F(\theta,t) := \theta A(t) - I = 0 \tag{3.11} \]
We may use the dynamic inverter (3.10), with $A(t)$ replaced by $A(t)^T$ and $\dot A(t)$ replaced by $\dot A(t)^T$, to approximate the left inverse of $A(t)$.
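A minimal sketch of the right-inverter (3.10) for a full-rank $2\times 3$ matrix (the $A(t)$ below is my own example): starting from the exact right inverse at $t=0$, the flow preserves the right-inverse relation $A(t)\eta(t) = I$.

```python
import numpy as np
from scipy.integrate import solve_ivp

mu = 50.0

def A(t):
    # full row rank for all t (rows can never become parallel)
    return np.array([[1.0, np.sin(t), 0.2],
                     [0.1, 1.0, np.cos(t)]])

def Adot(t):
    return np.array([[0.0, np.cos(t), 0.0],
                     [0.0, 0.0, -np.sin(t)]])

def rhs(t, s):
    eta = s.reshape(3, 2)
    # eta' = -mu eta (A eta - I) - eta A' eta         (3.10)
    deta = -mu * eta @ (A(t) @ eta - np.eye(2)) - eta @ Adot(t) @ eta
    return deta.ravel()

A0 = A(0.0)
eta0 = A0.T @ np.linalg.inv(A0 @ A0.T)     # exact right inverse at t = 0
sol = solve_ivp(rhs, (0.0, 4.0), eta0.ravel(), rtol=1e-9, atol=1e-12)
etaT = sol.y[:, -1].reshape(3, 2)
```

Note that the flow tracks a right inverse of $A(t)$, which is what (3.7) asks for; it need not coincide with the Moore-Penrose right inverse for all $t$.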

3.3 Inversion of Constant Matrices

In this section we consider two methods for the dynamic inversion of constant matrices: one for asymptotic inversion, and the other for inversion in finite time. In Section 3.5, relying on the methods of Section 3.4, we will consider another more complex, but more general, approach to the same problem.

Constant matrices may be inverted in a manner similar to the inversion of time-varying matrices as described in the last section. Let
\[ F(\eta) := M\eta - I \tag{3.12} \]
Let $\eta(t)$ denote the estimator for the inverse of a constant matrix $M$, with $\theta^* = M^{-1}$ as the solution of $F(\theta)=0$. Since $M$ is constant, $\dot\theta^*$ is zero. As a consequence, if $\eta(0)$ is sufficiently close to $\theta^*$, then a dynamic inverse of $F(\eta)$ (3.12) is $G[w,\eta] := \eta w$, and we can use the

Dynamic Inverter for Constant Square Matrices
\[ \dot\eta = -\mu\,\eta\,(M\eta - I) \tag{3.13} \]

Choosing $\eta(0)$ sufficiently close to $\theta^*$ assures us that, as $\eta(t)$ flows to $\theta^* = M^{-1}$, $\eta$ will not intersect the set of singular matrices.

3.3.1 A Comment on Gradient Methods

As shown in Section 3.2, Example 3.2.2, the function $G[w,\eta] := \eta w$ is not our only choice of a dynamic inverse $G[w,\eta,t]$ which is linear in $w$. It is easily verified that $G[w] = M^T w$, $w\in\mathbb{R}^{n\times n}$, is also a dynamic inverse for $F(\eta) := M\eta - I$, and that for this choice of dynamic inverse we do not need to worry about the dynamic inverse becoming singular; it is valid globally and leads to the

Dynamic Inverter for Constant Square Matrices
\[ \dot\eta = -\mu\,M^T(M\eta - I), \qquad \eta \to M^{-1} \tag{3.14} \]
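Because (3.14) is globally valid, no initial guess is needed; a minimal sketch (the test matrix is my own choice) can start from the zero matrix:

```python
import numpy as np
from scipy.integrate import solve_ivp

M = np.array([[2.0, 1.0],
              [0.5, 3.0]])
mu = 20.0

def rhs(t, s):
    eta = s.reshape(2, 2)
    # eta' = -mu M^T (M eta - I)                      (3.14)
    return (-mu * M.T @ (M @ eta - np.eye(2))).ravel()

sol = solve_ivp(rhs, (0.0, 5.0), np.zeros(4), rtol=1e-10, atol=1e-12)
etaT = sol.y[:, -1].reshape(2, 2)
```

The convergence rate is governed by $\mu\,\lambda_{\min}(M^T M)$, so poorly conditioned $M$ converges slowly; this is the usual price of the gradient-flow choice of dynamic inverse.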

Remark 3.3.1 Left and Right Inverses of Constant Matrices. If $M$ has full row-rank, with $M\in\mathbb{R}^{m\times n}$, $m\le n$, then the equilibrium solution of (3.14) is the right inverse $M^R := M^T(MM^T)^{-1}$ of $M$.

Dynamic Right-Inverter for Constant Matrices
\[ \dot\eta = -\mu\,M^T(M\eta - I), \qquad \eta \to M^R \tag{3.15} \]

If instead we were to choose $F(\eta) := \eta M - I$ and $G[w] := wM^T$, and if $M\in\mathbb{R}^{m\times n}$, $m\ge n$, has full column-rank, then the solution would be the left inverse $M^L := (M^T M)^{-1}M^T$ of $M$.

Dynamic Left-Inverter for Constant Matrices
\[ \dot\eta = -\mu\,(\eta M - I)M^T, \qquad \eta \to M^L \tag{3.16} \]

The dynamic inverter (3.14) is the standard least squares gradient flow (see [HM94], Section 1.6) for the function $\phi:\mathbb{R}^{n\times n}\to\mathbb{R};\ \eta\mapsto\phi(\eta)$, where
\[ \phi(\eta) := \tfrac12\|M\eta - I\|_2^2 \tag{3.17} \]
It is also the neural-network constant matrix inverter of Wang [Wan93]. Of course other gradient schemes may have the same solution as (3.14) though they may start from gradients of functions other than (3.17) (see, for instance, [JLS88]). In general, artificial neural networks are constructed to dynamically solve for the minimum of an energy function having a unique (at least locally) minimum, i.e. they realize gradient flows.

Connecting Gradient Methods with Dynamic Inversion

In general a dynamic inverter consists of three functions, $F$, $G$, and $E$, as described in Chapter 2. The function $F(\eta,t)$ is the implicit function to be inverted, $G[w,\eta,t]$ is a dynamic inverse for $F(\eta,t)$, and $E(\eta,t)$ is an estimator for the derivative with respect to $t$ of the root of $F(\theta,t)=0$. In order to relate gradient methods to dynamic inversion we consider the decomposition of a gradient flow system into an $E$, $F$, and $G$ forming a dynamic inverter. For instance, let $H:\mathbb{R}^{n\times n}\times\mathbb{R}\to\mathbb{R}$ be a smooth function. A gradient system based on this function is the

Gradient System
\[ \dot\eta = -\nabla H(\eta,t) + \frac{\partial\,\nabla H}{\partial t}(\eta,t) \tag{3.18} \]

where $\nabla$ denotes the gradient of $H(\eta,t)$ with respect to $\eta$. We may always identify gradient systems with dynamic inversion through the trivial dynamic inverse (see Property 2.2.6)
\[ G[w] = w \tag{3.19} \]
Then
\[ F(\eta,t) = \nabla H(\eta,t) \tag{3.20} \]
and
\[ E(\eta,t) = \frac{\partial\,\nabla H}{\partial t}(\eta,t) \tag{3.21} \]
Let $\mu=1$. Then
\[ \dot\eta = -\mu\,G[F(\eta,t)] + E(\eta,t) \tag{3.22} \]
is the same as (3.18). Thus we have decomposed the gradient system (3.18) into an $E$, $F$, and $G$.

It is more interesting, however, to find a dynamic inverse $G$ such that if $G$ were changed to the identity map, then the desired root would still be the solution to $F(\eta,t)=0$, but the resulting dynamic inverter would not converge to the desired root. For example, identifying $F(\eta) = M\eta - I$, $G[w] = M^T w$, and $E=0$ decomposes the gradient flow (3.14) into a dynamic inverter. For arbitrary $M\in GL(n,\mathbb{R})$ the stability properties of $\dot\eta = -\mu F(\eta)$ are unknown. But with $G$ defined as $G[w] = M^T w$, $\dot\eta = -\mu G[F(\eta)]$ has an asymptotically stable equilibrium at $\eta = M^{-1}$. For a system of the form (3.14) such a decomposition is straightforward. For more complicated gradient systems, however, we have no general methodology for decomposition into $E$, $F$, and $G$.

3.3.2 Dynamic Inversion of Constant Matrices by a Prescribed Time

The constant matrix dynamic inverters (3.13) and (3.14) above have the potential disadvantage of producing an exact inverse only asymptotically as $t\to\infty$. One may, however, wish to obtain the inverse by a prescribed time. To obtain inversion by a prescribed time we now consider another method. If we could create a time-varying matrix $H(t)$ that is invertible by inspection at $t=0$, and that equals $M$ at some known finite time $t>0$, say $t=1$, then perhaps we could use the technique of Section 3.2 for the inversion of time-varying matrices in order to invert $H(t)$. If $\eta(0) = H(0)^{-1}$, then the solution of the dynamic inverter at time $t=1$ will be $M^{-1}$. We require, of course, that $H(t)$ remain in $GL(n,\mathbb{R})$ as $t$ goes from 0 to 1. One ideal candidate for the initial value of the time-varying matrix is the identity matrix $I$, since it is its own inverse.

Example 3.3.2 Constant Matrix Inversion by a Prescribed Time Using Homotopy. Let $M$ be a constant matrix in $\mathbb{R}^{n\times n}$. We wish to dynamically determine the inverse of $M$. Consider the $t$-dependent

Matrix Homotopy
\[ H(t) = (1-t)I + tM \tag{3.23} \]

In the space of $n\times n$ matrices, $t\mapsto H(t)$ describes a $t$-parameterized curve, or homotopy, of matrices from the identity to $M = H(1)$, as indicated in Figure 3.1; in fact this curve (3.23) is a straight line.

Figure 3.1: The matrix homotopy $H(t)$.

From Theorem 3.2.1 we know how to dynamically invert a time-varying matrix given that we have an approximation of its inverse at time t = 0. If we know the exact inverse at time t = 0, then we may use the dynamic inverter of Theorem 3.2.1 to track the exact inverse of the time-varying matrix for all t 0. In the present case the inverse at time t = 0 is just (t) = M I for the identity I . We may invert H (t) by substituting H (t) for A(t), and H

(t) in (3.1), setting (0) = I . Since our initial conditions are a precise inverse of H (0), A Theorem 3.2.1 tells us that the matrix becomes the precise inverse of M at time t = 1 as shown schematically in Figure 3.2. That is, of course, if H (t) remains nonsingular as t goes from 0 to 1!

62

Dynamic Methods for Polar Decomposition and Inversion of Matrices Chap. 3

Figure 3.2: The matrix homotopy H(t) from I to M with the corresponding solution Γ(t), the inverse of H(t).

For a dynamic inverter for this example let

F(Γ, t) := ((1 − t)I + tM)Γ − I
G[w, Γ] := −Γw   (3.24)
E(Γ) := −Γ(M − I)Γ

Then a dynamic inverter is Γ̇ = G[μF(Γ, t), Γ] + E(Γ) with Γ(0) = I. Expanded, this is

Prescribed-Time Dynamic Inverter for Constant Matrices
Γ̇ = −μΓ(((1 − t)I + tM)Γ − I) − Γ(M − I)Γ

Another choice of linear dynamic inverse is G[w, t] := −((1 − t)I + tM)ᵀw, giving

Prescribed-Time Dynamic Inverter for Constant Matrices
Γ̇ = −μH(t)ᵀ(H(t)Γ − I) − Γ(M − I)Γ
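The inverter above is straightforward to integrate numerically. The sketch below (Python with NumPy; function and variable names are hypothetical, and the test matrix is an illustrative choice) integrates the first expanded form with a fixed-step Runge-Kutta scheme from Γ(0) = I; at t = 1 the state should equal M⁻¹.

```python
import numpy as np

def prescribed_time_inverse(M, mu=10.0, steps=2000):
    """Integrate dGamma/dt = -mu*Gamma*(H(t)Gamma - I) - Gamma*(M - I)*Gamma,
    with H(t) = (1 - t)I + tM and Gamma(0) = I, over t in [0, 1].
    Gamma(1) should equal inv(M) when H(t) stays nonsingular on [0, 1]."""
    n = M.shape[0]
    I = np.eye(n)
    Gamma = I.copy()
    h = 1.0 / steps

    def rhs(t, G):
        H = (1.0 - t) * I + t * M
        return -mu * G @ (H @ G - I) - G @ (M - I) @ G

    for k in range(steps):                 # classical fixed-step RK4
        t = k * h
        k1 = rhs(t, Gamma)
        k2 = rhs(t + h / 2, Gamma + h / 2 * k1)
        k3 = rhs(t + h / 2, Gamma + h / 2 * k2)
        k4 = rhs(t + h, Gamma + h * k3)
        Gamma = Gamma + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    return Gamma

M = np.array([[3.0, 1.0], [1.0, 2.0]])     # SPD, so no eigenvalues in (-inf, 0)
Gamma1 = prescribed_time_inverse(M)
print(np.allclose(Gamma1, np.linalg.inv(M), atol=1e-6))
```

Note that Γ(t) = H(t)⁻¹ is an exact solution of this flow for any μ, so the estimator term alone propagates the inverse; the gain term only corrects integration error.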

Homotopy-based methods, also called continuation methods, for solving sets of linear and nonlinear equations have been around for quite some time. For a review of


developments prior to 1980 see Allgower and Georg [AG80]. The general idea is that one starts with a problem with a known solution (e.g. the inverse of the identity matrix) and smoothly transforms that problem into a problem with an unknown solution, transforming the known solution in a corresponding manner until the unknown solution is reached. Often it is considerably easier to transform a known solution of one problem into an unknown solution of a closely related problem than to calculate the new solution from scratch. Solution of the roots of nonlinear polynomial equations (see Dunyak et al. [DJW84] and Watson [Wat81] for examples) is a typical example with broad engineering application.

Remark 3.3.3 Requirement for Nonsingular Homotopy. The scheme of Example 3.3.2 requires that there is no t ∈ [0, 1] for which H(t) (3.23) is singular. Recall that there are two maximal connected open subsets which comprise GL(n, ℝ), namely GL⁺(n, ℝ) = {M ∈ ℝ^{n×n} | det(M) > 0} and GL⁻(n, ℝ) = {M ∈ ℝ^{n×n} | det(M) < 0}. These two sets are disjoint and are separated by the variety of singular n × n matrices {M ∈ ℝ^{n×n} | det(M) = 0}. The identity I is in GL⁺(n, ℝ). In order for the curve t ↦ H(t) to be invertible, it must never leave GL⁺(n, ℝ) (see Figure 3.3).

Figure 3.3: The homotopy from I to M must remain in GL⁺(n, ℝ) to be invertible.

For our particular choice of H(t), since H(0) = I, and I is in GL⁺(n, ℝ), the homotopy H(t) should remain in GL⁺(n, ℝ) in order for it to be invertible for all t ∈ [0, 1]. The following lemma specifies sufficient conditions on M for H(t) (3.23) to remain in GL⁺(n, ℝ) as t goes from 0 to 1.


Lemma 3.3.4 Matrix Homotopy Lemma. If M ∈ GL(n, ℝ) has no eigenvalues in (−∞, 0), then for each t ∈ [0, 1], H(t) = (1 − t)I + tM is in GL(n, ℝ).

Remark 3.3.5 Inversion of Positive-Definite Symmetric Constant Matrices. If M is a positive definite symmetric matrix, then the assumption of Lemma 3.3.4 holds.

Remark 3.3.6 Subset Starlike about I. Lemma 3.3.4 tells us that the subset of GL(n, ℝ) consisting of all M ∈ GL(n, ℝ) such that σ(M) ∩ (−∞, 0) = ∅ is starlike about I, i.e. for each M in this subset, the straight line segment from I to M remains in the subset.

Proof of Lemma 3.3.4: Suppose that H(t̄) = (1 − t̄)I + t̄M is singular for some t̄ ∈ [0, 1]. The identity I is nonsingular, as is M by assumption, so t̄ ∉ {0, 1}. Thus there exists a non-zero v ∈ ℝⁿ such that

((1 − t̄)I + t̄M)v = 0.   (3.25)

Since t̄ ≠ 0 we can divide (3.25) by t̄ to obtain

( ((t̄ − 1)/t̄)I − M ) v = 0.   (3.26)

But t̄ can only satisfy (3.26) if λ(t̄) := (t̄ − 1)/t̄ is an eigenvalue of M. As t̄ ranges over (0, 1), λ(t̄) ranges over (−∞, 0). But by assumption M has no eigenvalues in (−∞, 0), hence no such t̄ exists in (0, 1), and H(t) is nonsingular on [0, 1].

We may obtain the exact inverse of M at any prescribed time t₁ > 0 by a slight modification of the homotopy (3.23). We summarize our results of this section in the following theorem.
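Lemma 3.3.4 is easy to probe numerically. The sketch below (Python with NumPy; the two test matrices are illustrative choices, not from the text) checks that det H(t) stays bounded away from zero along the segment for a matrix whose spectrum avoids (−∞, 0), and that the homotopy does pass through a singularity for a matrix with a negative eigenvalue.

```python
import numpy as np

def min_abs_det_on_segment(M, samples=1001):
    """Minimum of |det((1 - t)I + tM)| over a grid of t in [0, 1]."""
    I = np.eye(M.shape[0])
    ts = np.linspace(0.0, 1.0, samples)
    return min(abs(np.linalg.det((1 - t) * I + t * M)) for t in ts)

good = np.array([[2.0, 5.0], [0.0, 3.0]])  # eigenvalues 2, 3: none in (-inf, 0)
bad = np.array([[-1.0, 0.0], [0.0, 4.0]])  # eigenvalue -1: H(t) singular at t = 1/2
print(min_abs_det_on_segment(good))        # stays well away from zero
print(min_abs_det_on_segment(bad))         # hits zero at the grid point t = 0.5
```

For the second matrix, λ = −1 gives (t̄ − 1)/t̄ = −1 at t̄ = 1/2, exactly as in the proof.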

Theorem 3.3.7 Dynamic Inversion of Constant Matrices by a Prescribed Time. For any constant M ∈ GL(n, ℝ) and for any prescribed t₁ > 0, if σ(M) ∩ (−∞, 0) = ∅,


then the solution Γ(t) of the dynamic inverter

Prescribed-Time Dynamic Inverter for Constant Matrices
Γ̇ = −μΓ( ((1 − t/t₁)I + (t/t₁)M)Γ − I ) − (1/t₁)Γ(M − I)Γ   (3.27)

with Γ(0) = I, satisfies Γ(t₁) = M⁻¹.

Remark 3.3.8 Preservation of Symmetry. If M is symmetric, then the right-hand side of (3.27) is also symmetric. Thus if Γ(0) is symmetric, then Γ(t) is symmetric for all t.

Example 3.3.9 Right and Left Inverses of Constant Matrices by a Prescribed Time. Let A ∈ ℝ^{m×n} be a constant matrix with m ≤ n, and assume that A has full rank. The right inverse of A is given by A^R := Aᵀ(AAᵀ)⁻¹. To obtain the right inverse A^R at time t₁, we may apply Theorem 3.3.7 replacing M by AAᵀ, which is positive definite. Then Aᵀ(AAᵀ)⁻¹ = AᵀΓ(t₁).

Prescribed-Time Dynamic Right-Inversion of a Constant Matrix
Γ̇ = −μΓ( ((1 − t/t₁)I + (t/t₁)AAᵀ)Γ − I ) − (1/t₁)Γ(AAᵀ − I)Γ,   AᵀΓ(t₁) = A^R

If a constant A has full column rank, then since AᵀA is positive definite, the left inverse A^L := (AᵀA)⁻¹Aᵀ may be obtained by substituting AᵀA for M in Theorem 3.3.7. Then A^L = Γ(t₁)Aᵀ.

Prescribed-Time Dynamic Left-Inversion of a Constant Matrix
Γ̇ = −μΓ( ((1 − t/t₁)I + (t/t₁)AᵀA)Γ − I ) − (1/t₁)Γ(AᵀA − I)Γ,   Γ(t₁)Aᵀ = A^L
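The right-inverse display can be exercised with the same homotopy integrator used for square matrices. The sketch below (Python with NumPy; names are hypothetical) runs the inverter of Theorem 3.3.7 on AAᵀ and reads off AᵀΓ(t₁) as the right inverse.

```python
import numpy as np

def homotopy_inverse(M, t1=1.0, mu=10.0, steps=2000):
    """Gamma(t1) = inv(M) via
    dGamma/dt = -mu*Gamma*(H*Gamma - I) - (1/t1)*Gamma*(M - I)*Gamma,
    H(t) = (1 - t/t1)I + (t/t1)M, Gamma(0) = I (Theorem 3.3.7)."""
    n = M.shape[0]
    I = np.eye(n)
    Gamma, h = I.copy(), t1 / steps

    def rhs(t, G):
        H = (1 - t / t1) * I + (t / t1) * M
        return -mu * G @ (H @ G - I) - (1 / t1) * G @ (M - I) @ G

    for k in range(steps):                 # fixed-step RK4
        t = k * h
        k1 = rhs(t, Gamma)
        k2 = rhs(t + h / 2, Gamma + h / 2 * k1)
        k3 = rhs(t + h / 2, Gamma + h / 2 * k2)
        k4 = rhs(t + h, Gamma + h * k3)
        Gamma = Gamma + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    return Gamma

A = np.array([[1.0, 2.0, 0.0], [0.0, 1.0, 1.0]])   # full row rank, m <= n
A_right = A.T @ homotopy_inverse(A @ A.T)          # A^R = A^T (A A^T)^{-1}
print(np.allclose(A @ A_right, np.eye(2), atol=1e-6))
```

Since A has full row rank, A^R coincides with the Moore-Penrose pseudoinverse of A.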

Theorem 3.3.7 is limited in its utility by the necessity that M have a spectrum which does not intersect (−∞, 0). By appealing to the polar decomposition in Section 3.5 below, we will show that we may, at the cost of a slight increase in complexity, use dynamic inversion to produce an exact inverse of any invertible constant M, irrespective of its spectrum, by any prescribed time t₁ > 0.

3.4 Polar Decomposition for Time-Varying Matrices


In this section we will show how dynamic inversion may be used to perform polar decomposition [HJ85] and inversion of a time-varying matrix. We will assume that A(t) ∈ GL(n, ℝ), and that A(t), Ȧ(t), and A(t)⁻¹ are bounded on (0, ∞).

Though polar decomposition will be used here largely as a path to inversion, polar decomposition finds substantial utility in its own right. In particular it is used widely in the study of stress and strain in continuous media. See, for instance, Marsden and Hughes [MH83].

First consider the polar decomposition of a constant matrix M ∈ GL(n, ℝ), M = PU, where U is in the space of n × n orthogonal matrices with real entries, O(n, ℝ), and P is the symmetric positive definite square root of MMᵀ. Regarding M as a linear operator ℝⁿ → ℝⁿ, the polar decomposition expresses the action of M on a vector as a rotation (possibly with a reflection) followed by a scaling along the eigenvectors of MMᵀ. If M ∈ GL(n, ℝ), then P and U are unique.

Now consider the case of a t-dependent nonsingular square matrix A(t). Since A(t) is nonsingular for all t ≥ 0, A(t)A(t)ᵀ is positive definite for all t ≥ 0. For any t ≥ 0, the unique positive definite solution to XA(t)A(t)ᵀX − I = 0 is X(t) = P(t)⁻¹. Thus if we know X(t), then from A(t) = P(t)U(t) we can get the orthogonal factor U(t) of the polar decomposition by U(t) = X(t)A(t), as well as the symmetric positive definite part P(t) = X(t)A(t)A(t)ᵀ. We can also obtain the inverse of A(t) as

A(t)⁻¹ = U(t)ᵀX(t).   (3.28)

Since P(t) is a symmetric n × n matrix, it is parameterized by s(n) := n(n + 1)/2

elements, as is its inverse P(t)⁻¹. We will construct the dynamic inverter that produces P(t)⁻¹.

Remark 3.4.1 Vector Notation for Symmetric Matrices. It will be convenient for the purposes of this section and the next to adopt a notation that allows us to switch between


matrix representation and vector representation of elements of S(n, ℝ). The convenience of this notation will be seen in Section 3.4.1 to arise from the lack of a convenient matrix form of the inverse of the linear matrix mapping on S(n, ℝ), X ↦ XM + MX, where X and M are in S(n, ℝ).

Choose an ordered basis

B = {B₁, …, B_{s(n)}}   (3.29)

for S(n, ℝ). For any x ∈ ℝ^{s(n)} there corresponds a unique matrix x̄ ∈ S(n, ℝ), where the correspondence is through the expansion of x in the ordered basis B,

x ↦ x̄ := Σ_{i ≤ s(n)} xᵢBᵢ ∈ S(n, ℝ).   (3.30)

Conversely, for any X ∈ S(n, ℝ), let X̌ ∈ ℝ^{s(n)} denote the vector of the expansion coefficients of X in the basis B,

X = Σ_{i ≤ s(n)} X̌ᵢBᵢ.   (3.31)–(3.32)

Then

(x̄)ˇ = x  and  (X̌)¯ = X.   (3.33)

Let

Σ(t) := A(t)A(t)ᵀ   (3.34)

and let F : ℝ^{s(n)} × ℝ₊ → ℝ^{s(n)}; (x, t) ↦ F(x, t) be defined by

F(x, t) := (x̄Σ(t)x̄ − I)ˇ.   (3.35)

Let x* be a solution of F(x, t) = 0. Then x̄* is the inverse of a symmetric square root of Σ(t).

Nothing in the form of F(x, t) (3.35) enforces the positive definiteness of the solution x̄*(t). For instance, for each solution x*(t) of F(x, t) = 0, −x*(t) is also a solution. Each solution t ↦ x*(t) is, however, isolated as long as D₁F(x*, t) is nonsingular. We will show in the next subsection, Subsection 3.4.1, that the nonsingularity of A(t) implies the nonsingularity of D₁F(x*, t).
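The correspondence x ↦ x̄ and X ↦ X̌ of Remark 3.4.1 is mechanical to implement. The sketch below (Python with NumPy) builds one concrete ordered basis {Eᵢᵢ, Eᵢⱼ + Eⱼᵢ} for S(n, ℝ), an arbitrary but fixed choice consistent with s(n) = n(n + 1)/2, and round-trips a symmetric matrix through the two maps.

```python
import numpy as np

def sym_basis(n):
    """Ordered basis B_1, ..., B_{s(n)} of S(n, R), with s(n) = n(n+1)/2."""
    basis = []
    for i in range(n):
        for j in range(i, n):
            B = np.zeros((n, n))
            B[i, j] = B[j, i] = 1.0
            basis.append(B)
    return basis

def mat(x, basis):
    """x in R^{s(n)}  ->  xbar = sum_i x_i B_i in S(n, R)."""
    return sum(xi * B for xi, B in zip(x, basis))

def vec(X, basis):
    """X in S(n, R)  ->  its coordinate vector in the basis."""
    # For this basis, the coefficient of the element supported on (i, j)
    # is simply the entry X[i, j].
    rows, cols = zip(*[(np.nonzero(B)[0][0], np.nonzero(B)[1][0]) for B in basis])
    return np.array([X[i, j] for i, j in zip(rows, cols)])

n = 3
B = sym_basis(n)
X = np.array([[2.0, 1.0, 0.0], [1.0, 3.0, -1.0], [0.0, -1.0, 4.0]])
print(np.allclose(mat(vec(X, B), B), X))   # round trip: (X-check)bar = X
```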


3.4.1 The Lyapunov Map


We will use a linear dynamic inverse for F(x, t) (3.35) based upon the matrix inverse of D₁F(x*, t). We will estimate this matrix inverse using dynamic inversion. It is not immediately obvious, however, that D₁F(x*, t) is invertible. In this subsection we will consider the invertibility of D₁F(x*, t).

Differentiate

F̄(x, t) = x̄Σ(t)x̄ − I   (3.36)

with respect to x. The resulting differential D₁F(x, t), expressed as a mapping S(n, ℝ) → S(n, ℝ), is

L_{Σ(t)x̄} : Y ↦ L_{Σ(t)x̄}(Y) := YΣ(t)x̄ + x̄Σ(t)Y.   (3.37)–(3.38)

The matrix D₁F(x, t) ∈ ℝ^{s(n)×s(n)} is the representation of L_{Σ(t)x̄} in a basis B of S(n, ℝ), acting on matrices Y expressed as vectors Y̌ ∈ ℝ^{s(n)}. Thus the matrix D₁F(x, t) is invertible if and only if L_{Σ(t)x̄} is an invertible map.

We will refer to a map of the form

L_M : Y ↦ L_M(Y) := YM + MᵀY   (3.39)

with Y and M in ℝ^{n×n} as a Lyapunov map, due to its relation to the Lyapunov equation YM + MᵀY = −Q which arises in the study of the stability of linear control systems (see e.g. Horn and Johnson [HJ91], Chapter 4). It may be easily verified that a Lyapunov map (3.39) is linear in Y. It may also be proven that L_M is an invertible map if no two eigenvalues of M add up to zero (see e.g. [HJ91], Theorem 4.4.6, page 270).

Now note that Σ(t)x̄* = x̄*Σ(t) = P(t), which is positive definite and symmetric, having only real-valued and strictly positive eigenvalues. Thus no pair of eigenvalues of Σ(t)x̄* sums to zero. Therefore L_{Σ(t)x̄*} is nonsingular, and it follows that the matrix D₁F(x*, t) is invertible. Since D₁F(x, t) is continuous in x, it follows that D₁F(x, t) remains invertible for all x in a sufficiently small neighborhood of x*.

Though numerical inversion of the Lyapunov map has long been a topic of interest in the context of control theory [BS72, GNL79], we do not know of any matrix map L⁻¹ : S(n, ℝ) → S(n, ℝ), taking matrices to matrices, which inverts L_M. By converting L_M to an s(n) × s(n) matrix, however, and representing elements of S(n, ℝ) as vectors, the inverse L⁻¹ as a mapping between vector spaces ℝ^{s(n)} → ℝ^{s(n)} can be obtained through standard matrix inversion or, as we will see, dynamic matrix inversion. This is why we sometimes resort to the vector notation of Remark 3.4.1 in referring to elements of S(n, ℝ).
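The conversion of L_M to a matrix acting on vectorized symmetric matrices can be done with Kronecker products. The sketch below (Python with NumPy; it uses the full n² × n² representation rather than the reduced s(n) × s(n) one, for brevity) represents L_M(Y) = YM + MᵀY on row-major-vectorized Y as K = I ⊗ Mᵀ + Mᵀ ⊗ I and inverts it by a standard linear solve; a dynamic inverter would replace that solve, as described above.

```python
import numpy as np

def lyapunov_solve(M, Q):
    """Solve L_M(Y) = Y M + M^T Y = Q by vectorizing the Lyapunov map.
    Solvable whenever no two eigenvalues of M sum to zero."""
    n = M.shape[0]
    I = np.eye(n)
    # Row-major vec identity: vec(A Y B) = (A kron B^T) vec(Y).
    K = np.kron(I, M.T) + np.kron(M.T, I)
    return np.linalg.solve(K, Q.reshape(-1)).reshape(n, n)

M = np.array([[4.0, 1.0], [0.0, 3.0]])    # eigenvalues 4, 3: all pair sums nonzero
Q = np.array([[1.0, 2.0], [2.0, -1.0]])   # symmetric right-hand side
Y = lyapunov_solve(M, Q)
print(np.allclose(Y @ M + M.T @ Y, Q))    # L_M(Y) reproduces Q
```

For symmetric Q the unique solution Y is automatically symmetric, which is the observation that motivates the reduced s(n)-dimensional representation in the text.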


3.4.2 Dynamic Polar Decomposition


The estimator for D₁F(x*, t)⁻¹ will be denoted Π ∈ ℝ^{s(n)×s(n)}, so that

Π ≈ D₁F(x*, t)⁻¹.   (3.40)

Using Π, we may define a dynamic inverse for F(x, t). Let G : ℝ^{s(n)} × ℝ^{s(n)×s(n)} → ℝ^{s(n)}; (w, Π) ↦ G[w, Π] be defined by

G[w, Π] := −Πw   (3.41)

for w ∈ ℝ^{s(n)}. This makes G[w, Π] (3.41) a dynamic inverse for F(x, t) = (x̄Σ(t)x̄ − I)ˇ, as long as Π is sufficiently close to D₁F(x*, t)⁻¹.

To construct an estimator E(x, Π, t) ∈ ℝ^{s(n)} of ẋ*, first differentiate F(x*, t) = 0,

D₁F(x*, t)ẋ* + D₂F(x*, t) = 0,   (3.42)

and then solve for ẋ*,

ẋ* = −D₁F(x*, t)⁻¹D₂F(x*, t).   (3.43)

Note that D₂F(x*, t) = (x̄*Σ̇(t)x̄*)ˇ. Now substitute x and Π for x* and D₁F(x*, t)⁻¹ to obtain

E(x, Π, t) := −Π(x̄Σ̇(t)x̄)ˇ.   (3.44)

To obtain Π, let F′ : ℝ^{s(n)} × ℝ^{s(n)×s(n)} × ℝ₊ → ℝ^{s(n)×s(n)}; (x, Π, t) ↦ F′(x, Π, t) be defined by

F′(x, Π, t) := D₁F(x, t)Π − I.   (3.45)

A linear dynamic inverse for F′(x, Π, t) is G′ : ℝ^{s(n)×s(n)} × ℝ^{s(n)×s(n)} → ℝ^{s(n)×s(n)}; (w, Π) ↦ G′[w, Π] defined by

G′[w, Π] := −Πw.   (3.46)

For an estimator E′(x, Π, t) for Π̇*, we differentiate F′(x*, Π*, t) = 0 with respect to t, solve for Π̇*, and substitute x and Π for x* and Π* respectively to get

E′(x, Π, t) := −Π [ d/dt D₁F(x, t) |_{ẋ = E(x,Π,t)} ] Π.   (3.47)

Combining the E's, F's, and G's from (3.44), (3.35), (3.41), (3.47), (3.45), and (3.46), we obtain the dynamic inverter

ẋ = G[μF(x, t), Π] + E(x, Π, t)
Π̇ = G′[μF′(x, Π, t), Π] + E′(x, Π, t)   (3.48)


or in an expanded form

Dynamic Polar Decomposition for Time-Varying Matrices
ẋ = −μΠ(x̄Σ(t)x̄ − I)ˇ − Π(x̄Σ̇(t)x̄)ˇ
Π̇ = −μΠ(D₁F(x, t)Π − I) − Π [ d/dt D₁F(x, t) |_{ẋ = E(x,Π,t)} ] Π   (3.49)
x̄A(t)A(t)ᵀ ≈ P(t),  x̄A(t) ≈ U(t),  A(t)ᵀ(x̄)² ≈ A(t)⁻¹

Initial conditions for the dynamic inverter (3.49) may be set so that x̄(0) ≈ P(0)⁻¹ and Π(0) ≈ D₁F((P(0)⁻¹)ˇ, 0)⁻¹. Under these conditions x̄(t) ≈ P(t)⁻¹ for all t ≥ 0.

Combining the results above with the dynamic inversion theorem, Theorem 2.3.5, gives the following theorem.

Theorem 3.4.2 Dynamic Polar Decomposition of Time-Varying Matrices. Let A(t) be in GL(n, ℝ) for all t ∈ ℝ₊. Let the polar decomposition of A(t) be A(t) = P(t)U(t) with P(t) ∈ S(n, ℝ) the positive definite symmetric square root of Σ(t) := A(t)A(t)ᵀ, and U(t) ∈ O(n, ℝ) for all t ∈ ℝ₊. Let x be in ℝ^{s(n)}, and let Π be in ℝ^{s(n)×s(n)}. Let (x(t), Π(t)) denote the solution of the dynamic inverter (3.49), where F(x, t) is given by (3.35). Then there exists a μ̄ such that if the dynamic inversion gain satisfies μ > μ̄, and (x(0), Π(0)) is sufficiently close to ((P(0)⁻¹)ˇ, D₁F((P(0)⁻¹)ˇ, 0)⁻¹), then

i. Σ(t)x̄(t) exponentially converges to P(t),
ii. x̄(t)A(t) exponentially converges to U(t), and
iii. A(t)ᵀx̄(t)² exponentially converges to A(t)⁻¹.
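The three readouts of Theorem 3.4.2 can be sanity-checked at a frozen time instant by computing x̄ = P⁻¹ directly with a matrix square root, rather than by integrating (3.49). The sketch below (Python with NumPy and SciPy; A is an arbitrary nonsingular test matrix) verifies that x̄AAᵀ, x̄A, and Aᵀx̄² recover P, U, and A⁻¹.

```python
import numpy as np
from scipy.linalg import sqrtm

A = np.array([[10.0, 1.0], [0.5, 1.0]])        # nonsingular test matrix
Sigma = A @ A.T
P = np.real(sqrtm(Sigma))                      # positive definite symmetric factor
xbar = np.linalg.inv(P)                        # the exact root x* of (3.35)

P_est = xbar @ A @ A.T                         # readout i:   xbar*Sigma  -> P
U_est = xbar @ A                               # readout ii:  xbar*A      -> U
Ainv_est = A.T @ xbar @ xbar                   # readout iii: A^T xbar^2  -> A^{-1}

print(np.allclose(P_est, P))                   # True
print(np.allclose(U_est @ U_est.T, np.eye(2))) # U is orthogonal
print(np.allclose(P_est @ U_est, A))           # A = P U
print(np.allclose(Ainv_est, np.linalg.inv(A))) # inverse recovered
```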

An example of the polar decomposition of a 2 × 2 matrix will illustrate application of Theorem 3.4.2.

Example 3.4.3 Polar Decomposition of a Time-Varying Matrix. Let

A(t) := [ 10 + sin(10t)  cos(t) ; t  1 ].   (3.50)


We will perform polar decomposition and inversion of A(t) over t ∈ [0, 8], an interval over which A(t) is nonsingular. We will estimate P(t) and U(t) such that A(t) = P(t)U(t), with P(t) ∈ S(2, ℝ) being the positive definite symmetric square root of A(t)A(t)ᵀ, and with U(t) ∈ O(2, ℝ). In this case x ∈ ℝ³ and Π ∈ ℝ³ˣ³. Let

Σ(t) = [ σ₁  σ₂ ; σ₂  σ₃ ] = A(t)A(t)ᵀ.   (3.51)

We choose the ordered basis B of S(2, ℝ) to be

B₁ = [ 1 0 ; 0 0 ],  B₂ = [ 0 1 ; 1 0 ],  B₃ = [ 0 0 ; 0 1 ].   (3.52)

In this basis we have

F(x, t) = (x̄Σ(t)x̄ − I)ˇ =
[ σ₁x₁² + 2σ₂x₁x₂ + σ₃x₂² − 1 ;
  σ₁x₁x₂ + σ₂x₂² + σ₂x₁x₃ + σ₃x₂x₃ ;
  σ₁x₂² + 2σ₂x₂x₃ + σ₃x₃² − 1 ].   (3.53)

Then

D₁F(x, t) =
[ 2(σ₁x₁ + σ₂x₂)   2(σ₂x₁ + σ₃x₂)         0 ;
  σ₁x₂ + σ₂x₃      σ₁x₁ + 2σ₂x₂ + σ₃x₃    σ₂x₁ + σ₃x₂ ;
  0                2(σ₁x₂ + σ₂x₃)         2(σ₂x₂ + σ₃x₃) ].   (3.54)

For an estimator for ẋ* we have from (3.44)

E(x, Π, t) = −Π [ σ̇₁x₁² + 2σ̇₂x₁x₂ + σ̇₃x₂² ;
  σ̇₁x₁x₂ + σ̇₂x₂² + σ̇₂x₁x₃ + σ̇₃x₂x₃ ;
  σ̇₁x₂² + 2σ̇₂x₂x₃ + σ̇₃x₃² ].   (3.55)

The estimator E′ for Π̇ is given by (3.47), where

d/dt D₁F(x, t) |_{ẋ = E(x,Π,t)} = [ L̇₁₁  L̇₁₂  0 ; L̇₂₁  L̇₂₂  L̇₂₃ ; 0  L̇₃₂  L̇₃₃ ]   (3.56)

with

L̇₁₁ = 2σ̇₁x₁ + 2σ₁E₁(x, Π, t) + 2σ̇₂x₂ + 2σ₂E₂(x, Π, t)
L̇₁₂ = 2L̇₂₃
L̇₂₁ = σ̇₁x₂ + σ₁E₂(x, Π, t) + σ̇₂x₃ + σ₂E₃(x, Π, t)
L̇₂₂ = σ̇₁x₁ + σ₁E₁(x, Π, t) + 2σ̇₂x₂ + 2σ₂E₂(x, Π, t) + σ̇₃x₃ + σ₃E₃(x, Π, t)
L̇₂₃ = σ̇₂x₁ + σ₂E₁(x, Π, t) + σ̇₃x₂ + σ₃E₂(x, Π, t)
L̇₃₂ = 2L̇₂₁
L̇₃₃ = 2σ̇₂x₂ + 2σ₂E₂(x, Π, t) + 2σ̇₃x₃ + 2σ₃E₃(x, Π, t)   (3.57)

Dynamic inversion using equations (3.49) was simulated using the adaptive step size Runge-Kutta integrator ode45 from Matlab, with the default tolerance of 10⁻⁶. The initial conditions were set so that

x̄(0) = Σ(0)^(−1/2) + ē_x,  Π(0) = D₁F(x(0), 0)⁻¹,   (3.58)

where e_x = [0.55, 0.04, 2.48]ᵀ is an error that has been deliberately added to demonstrate the error transient of the dynamic inverter. The value of μ was set to 10.

The graph of Figure 3.4 shows the values of the individual elements of A(t). The top graph of Figure 3.5 shows the elements of x(t), the estimator for P(t)⁻¹, and the bottom graph of Figure 3.5 shows the elements of Π(t). Figure 3.6 shows log₁₀(‖x̄(t)Σ(t)x̄(t) − I‖), indicating the extent to which x̄, the estimator for P(t)⁻¹, fails to be the inverse of the square root of Σ(t) = A(t)A(t)ᵀ. For estimates of P(t), U(t), and A(t)⁻¹ we have

x̄(t)A(t)A(t)ᵀ ≈ P(t),  x̄(t)A(t) ≈ U(t),  A(t)ᵀx̄(t)² ≈ A(t)⁻¹.   (3.59)


Figure 3.4: Elements of A(t) (see (3.50)). See Example 3.4.3.

Figure 3.5: Elements of x (top) and Π (bottom). See Example 3.4.3.


Figure 3.6: The error log₁₀(‖x̄(t)Σ(t)x̄(t) − I‖) indicating the extent to which x̄ fails to satisfy x̄Σ(t)x̄ − I = 0. The ripple from t ≈ 1.8 to t = 8 is due to numerical noise. See Example 3.4.3.


Remark 3.4.4 Symmetry of the Dynamic Inverter. It is interesting to note that (P(t)⁻¹)ˇ, besides being a solution to (x̄Σ(t)x̄ − I)ˇ = 0, is also a solution to (Σ(t)x̄² − I)ˇ = 0 as well as (x̄²Σ(t) − I)ˇ = 0. But Σ(t)x̄² − I and x̄²Σ(t) − I are not, in general, symmetric even when Σ(t) and x̄ are symmetric. Though exponential convergence is still guaranteed when using these forms, the flow x̄(t) is not, in general, confined to S(n, ℝ). Using these forms would increase the number of equations in the dynamic inverter by n(n − 1)/2 + n⁴ − s(n)², since not only would the right hand side of the top equation of (3.49) no longer be symmetric, but Π would be n² × n² rather than s(n) × s(n).

3.5 Polar Decomposition and Inversion of Constant Matrices


In the dynamic inversion techniques of Sections 3.2 and 3.4 we assumed that we had available an approximation of A(0)⁻¹ with which to set the initial conditions in the dynamic inversion of A(t). Thus we would need to invert at least one constant matrix, A(0), in order to start the dynamic inverter. Methods of constant matrix inversion presented in Section 3.3 had the potential disadvantage of either producing exact inversion only asymptotically as t → ∞, or of only working on matrices with no eigenvalues in the interval (−∞, 0). The question naturally arises, then: how might we use dynamic inversion to invert any constant matrix so that the exact inverse is available by a prescribed time? In this section, by appealing to both homotopy and polar decomposition, we give an answer to this question.

Let M be in GL(n, ℝ) with

M = PU,  P = Pᵀ > 0,  UUᵀ = I.   (3.60)

Helmke and Moore (see [HM94], pages 150–152) have described a gradient flow for the function (Û, P̂) ↦ ‖M − ÛP̂‖²,

dÛ/dt = MP̂ − ÛP̂MᵀÛ,  dP̂/dt = −2P̂ + MᵀÛ + ÛᵀM,   (3.61)

where P̂ and Û are meant to approximate P and U respectively. Asymptotically, this system produces factors P̂ and Û satisfying M − ÛP̂ = 0 for almost all initial conditions (P̂(0), Û(0)) as t → ∞. A difficulty with this approach, as the authors point out, is that positive definiteness of the approximator P̂ is not guaranteed.

In this section we describe a dynamic system that provides polar decomposition of any nonsingular constant matrix by any prescribed time, with the positive definiteness

of the estimator of P guaranteed. This will be accomplished by applying Theorem 3.4.2 on dynamic polar decomposition of time-varying matrices to the homotopy

Σ(t) := (1 − t)I + tMMᵀ.   (3.62)

Unlike the homotopy H(t) = (1 − t)I + tM of Section 3.3, the homotopy Σ(t) (3.62) is guaranteed to have a spectrum which avoids (−∞, 0) for any nonsingular M, since Σ(t) is a positive definite symmetric matrix for all t ∈ [0, 1]. The situation is depicted in Figure 3.7.

Figure 3.7: Σ(t) is positive definite and symmetric for all t ∈ [0, 1].

Recall that M is in GL(n, ℝ). For Σ(t) as defined in (3.62) note that Σ(0) = I, Σ(1) = MMᵀ, and for all t ∈ [0, 1], Σ(t) is positive definite and symmetric. Let P(t) denote the positive definite symmetric square root of Σ(t), and let the estimator of P(t)⁻¹ be x̄ ∈ S(n, ℝ). Differentiate Σ(t) (3.62) with respect to t to get

Σ̇(t) = MMᵀ − I.   (3.63)

Now we may apply the dynamic inverter of Section 3.4 in order to perform the polar decomposition of M. As in (3.35), let

F(x, t) := (x̄Σ(t)x̄ − I)ˇ.   (3.64)

By inspection it may be verified that x̄*(0) = I and Π*(0) = ½I. If we set x̄(0) = I and Π(0) = ½I, then Theorem 2.3.5 and the results of the last section assure us that x̄(t) ≡ P(t)⁻¹


for all t ≥ 0, and thus x̄(1) = P⁻¹. Consequently

x̄(1) = P⁻¹
Σ(1)x̄(1) = MMᵀx̄(1) = P
x̄(1)M = U
Mᵀx̄(1)² = M⁻¹.   (3.65)

Note that Σ̇(t) = MMᵀ − I = 0 if and only if M is orthogonal, in which case M⁻¹ = Mᵀ.

Combining the results of this section with the results of the last section gives the following theorem.

Theorem 3.5.1 Dynamic Polar Decomposition of Constant Matrices by a Prescribed Time. Let M be in GL(n, ℝ). Let the polar decomposition of M be M = PU with P ∈ S(n, ℝ) the positive definite symmetric square root of MMᵀ and U ∈ O(n, ℝ). Let x be in ℝ^{s(n)}, and let Π be in ℝ^{s(n)×s(n)}. Let x̄(0) = I and Π(0) = ½I. Let (x(t), Π(t)) denote

the solution of

Prescribed-Time Dynamic Inverter for Constant Matrices
ẋ = G[μF(x, t), Π] + E(x, Π)
Π̇ = G′[μF′(x, Π, t), Π] + E′(x, Π)
Σ(t) = (1 − t)I + tMMᵀ
F(x, t) = (x̄Σ(t)x̄ − I)ˇ
G[w, Π] = −Πw
E(x, Π) = −Π(x̄(MMᵀ − I)x̄)ˇ
F′(x, Π, t) = D₁F(x, t)Π − I
G′[w, Π] = −Πw
E′(x, Π) = −Π [ d/dt D₁F(x, t) |_{ẋ = E(x,Π)} ] Π   (3.66)

Then for any μ > 0,

MMᵀx̄(1) = P,  x̄(1)M = U,  and  Mᵀx̄(1)² = M⁻¹.   (3.67)
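The advantage over the homotopy of Section 3.3 is that (3.62) works even when M has eigenvalues in (−∞, 0). The sketch below (Python with NumPy/SciPy; the matrix is an illustrative choice) checks that Σ(t) = (1 − t)I + tMMᵀ stays positive definite along a grid for such an M, that the straight-line homotopy of Section 3.3 fails for the same M, and that the t = 1 readouts (3.67) hold with x̄(1) computed as the exact inverse square root.

```python
import numpy as np
from scipy.linalg import sqrtm

M = np.array([[-2.0, 1.0], [0.0, -3.0]])   # eigenvalues -2, -3: Lemma 3.3.4 fails
I = np.eye(2)

# Sigma(t) stays symmetric positive definite on [0, 1] ...
spd = all(np.all(np.linalg.eigvalsh((1 - t) * I + t * (M @ M.T)) > 0)
          for t in np.linspace(0.0, 1.0, 101))
print(spd)   # True

# ... while H(t) = (1 - t)I + tM hits a singularity (here at t = 1/4).
print(min(abs(np.linalg.det((1 - t) * I + t * M))
          for t in np.linspace(0.0, 1.0, 1001)) < 1e-12)   # True

xbar1 = np.linalg.inv(np.real(sqrtm(M @ M.T)))             # exact x(1) = P^{-1}
print(np.allclose((M @ M.T) @ xbar1 @ (xbar1 @ M), M))     # P U = M
print(np.allclose(M.T @ xbar1 @ xbar1, np.linalg.inv(M)))  # M^T x(1)^2 = M^{-1}
```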


Remark 3.5.2 Polar Decomposition by Any Prescribed Time. As in Theorem 3.3.7 we can force x̄ to equal P⁻¹ at any time t₁ > 0 by substituting t/t₁ for t in Σ(t), and proceeding with the derivation of the dynamic inverter as above. Then x̄(t₁) = P⁻¹.

Example 3.5.3 A digital computer simulation of a dynamic inverter for the polar decomposition of a constant 2-by-2 matrix was performed. The integration was performed in Matlab [Mat92] using ode45, an adaptive step size Runge-Kutta routine, using the default tolerance of 10⁻⁶. The matrix M was chosen randomly to be

M = [ 7  −3 ; 24  3 ].   (3.68)

The value of μ was set to 10. The evolution of the elements of x(t) and Π(t) are shown in Figure 3.8.

Figure 3.8: Elements of x(t) (top) and Π(t) (bottom), for Example 3.5.3.

Figure 3.9 shows the base 10 log of ‖F(x, t)‖ = ‖x̄(t)Σ(t)x̄(t) − I‖, indicating


the extent to which x̄, the estimator for P⁻¹, fails to be the inverse of the square root of Σ(t), which at t = 1 equals MMᵀ.

Figure 3.9: The base 10 log of the error ‖x̄(t)Σ(t)x̄(t) − I‖ for Example 3.5.3.

The final value (t = 1) of the error ‖x̄(t)Σ(t)x̄(t) − I‖ was

‖x̄(1)Σ(1)x̄(1) − I‖ = 1.0611 × 10⁻⁶.   (3.69)

Final values of P, U, and M⁻¹ were

P = MMᵀx̄(1) = [ 5.2444  5.5223 ; 5.5223  23.5479 ]
U = x̄(1)M = [ 0.3473  −0.9377 ; 0.9377  0.3473 ]
M⁻¹ = Mᵀx̄(1)² = [ 0.0323  0.0323 ; −0.2581  0.0753 ]   (3.70)
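The factors in (3.70) can be cross-checked directly with a matrix square root; the sketch below (Python with NumPy/SciPy) recomputes them for the M of (3.68), treating the quoted four-decimal magnitudes as the reference.

```python
import numpy as np
from scipy.linalg import sqrtm

M = np.array([[7.0, -3.0], [24.0, 3.0]])
P = np.real(sqrtm(M @ M.T))        # positive definite symmetric factor
U = np.linalg.inv(P) @ M           # orthogonal factor
Minv = np.linalg.inv(M)

P_ref = np.array([[5.2444, 5.5223], [5.5223, 23.5479]])
U_ref = np.array([[0.3473, -0.9377], [0.9377, 0.3473]])
Minv_ref = np.array([[0.0323, 0.0323], [-0.2581, 0.0753]])
print(np.allclose(P, P_ref, atol=5e-5),
      np.allclose(U, U_ref, atol=5e-5),
      np.allclose(Minv, Minv_ref, atol=5e-5))   # True True True
```

As a consistency check, M Mᵀ = [58 159; 159 585], and P above is its exact positive square root (trace 28.7924, determinant 93).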


3.6 Chapter Summary
We have seen how the polar decomposition and inversion of time-varying and constant matrices may be accomplished by continuous-time dynamic systems. Our results are easily modified to provide solutions for time-varying and time-invariant linear equations of the form A(t)x = b. We have also seen that dynamic inversion in the matrix context provides a useful and general conceptual framework through which to view other methods of dynamic computation such as gradient flow methods.

In some control problems, dynamic inversion may provide essential signals which can be incorporated into controllers for nonlinear dynamic systems [GM95c]. In those same problems it may also be used for matrix inversion. For example, dynamic inversion will be incorporated into a controller for robotic manipulators in Chapter 5, where the dynamic inverter will produce inverse kinematic solutions necessary for the control law. If inversion of, say, a time-varying mass matrix is also required in the same problem, a dynamic inverter may be augmented to provide that capability too, without interfering with other inversions within the same problem.


Chapter 4

Tracking Implicit Trajectories


4.1 Introduction
In this chapter we consider the problem of controlling the output of a time-invariant nonlinear control system to track a given implicitly defined reference trajectory. By an implicitly defined reference trajectory is meant a reference trajectory γ(t) defined as a particular continuous isolated solution to an equation of the form F(γ, t) = 0. A dynamic inverter (see Chapter 2) will be incorporated into a tracking controller in order to control the output of a nonlinear control system to track such an implicit trajectory.

A standard output-tracking controller for a given nonlinear time-invariant plant having vector relative degree (see Appendix C) relies upon explicit expressions for both an output reference trajectory, as well as the time-derivatives of the output reference trajectory. Given the reference trajectory and its derivatives, exponentially convergent tracking can be guaranteed by feedback linearization (see Section C of Appendix C) followed by standard tracking control for integrator chains (reviewed in Section B.4 of Appendix B).

For a simple example, consider the system

ẋ = f(x) + g(x)w,  y = x   (4.1)

with input w, output y, state x, and where x, f(x), g(x), w, and y are in ℝ, and f(0) = 0. Assume g(x) ≠ 0 for all x in a neighborhood of 0. We feedback-linearize system (4.1) by setting

w = (1/g(x))(−f(x) + u)   (4.2)

in (4.1) to get

ẋ = u.   (4.3)


Thus the relationship between the new input u and the output y is through the linear equation

ẏ = u.   (4.4)

Then, to make the output y(t) track a reference trajectory y_d(t), we set

u = ẏ_d − λ(x − y_d)   (4.5)

with λ ∈ ℝ, λ > 0. Insert this u (4.5) into (4.3) to get

ẋ = ẏ_d − λ(x − y_d).   (4.6)

The control (4.5) causes the output y(t) to converge to the reference trajectory y_d(t) exponentially¹. This can be seen by letting e := x − y_d, in which case (4.6) takes the form

ė = −λe,   (4.7)

which is an autonomous linear dynamic system with exponentially stable equilibrium e = 0. Thus e → 0 as t → ∞. Since e = y − y_d, this implies that y(t) → y_d(t) exponentially as t → ∞.

Now suppose we substitute, for the explicit reference trajectory y_d(t) and its derivatives, estimators for an implicitly defined reference trajectory γ(t) and its derivatives, obtained through dynamic inversion. It seems reasonable to expect that the combination of dynamic inverter and controlled plant² will display error dynamics that are stable at least asymptotically. We will prove that this reasonable expectation is indeed correct, and that in fact we can achieve exponentially stable output tracking-error dynamics.
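The scalar example can be simulated end to end. The sketch below (Python with NumPy; f, g, the gain, and the reference are illustrative choices, not from the text) applies the feedback-linearizing control (4.2) with the tracking law (4.5) and confirms that the tracking error decays toward zero.

```python
import numpy as np

f = lambda x: -np.sin(x)          # illustrative f with f(0) = 0
g = lambda x: 2.0 + np.cos(x)     # g(x) != 0 everywhere
yd = lambda t: np.sin(t)          # reference trajectory
yd_dot = lambda t: np.cos(t)

lam, h, T = 2.0, 1e-3, 5.0
x = 1.5                           # initial tracking error e(0) = 1.5
for k in range(int(T / h)):       # forward-Euler simulation of (4.1)
    t = k * h
    u = yd_dot(t) - lam * (x - yd(t))   # tracking law (4.5)
    w = (-f(x) + u) / g(x)              # feedback linearization (4.2)
    x = x + h * (f(x) + g(x) * w)       # plant dynamics
e_final = abs(x - yd(T))
print(e_final)                    # small residual: e(0) exp(-lam T) plus O(h) bias
```

With exact cancellation of f and g, the simulated error obeys the discretization of ė = −λe, so the residual after T = 5 is dominated by the integrator's O(h) bias rather than the exponentially decayed initial error.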

4.1.1 Motivation
The implicit tracking problem has been motivated in part by the problem of controlling robotic manipulators to track inverse-kinematic solutions, an application which is reviewed and explored in the robotic manipulator control context in Chapter 5. In the present chapter we study an implicit tracking problem, defined precisely in Section 4.2, that is more general for two reasons:

i. We consider the implicit output reference trajectory γ(t) to be the solution of an equation of the general form F(γ, t) = 0, rather than the form F(γ) − x(t) = 0 commonly used in the inverse-kinematics problem. We allow the possibility that the

¹ By exponential convergence of q(t) ∈ ℝⁿ to r(t) ∈ ℝⁿ we mean that there exist two positive real constants k₁ and k₂ such that ‖q(t) − r(t)‖ ≤ k₁‖q(0) − r(0)‖e^{−k₂t}.
² We use the standard term plant to refer to the dynamic system which we wish to control.


t-dependence of F(γ, t) may arise through dependence of F on the solution of an exogenous dynamic system.

ii. We consider that there may be performance limitations due to the possibility of unbounded internal dynamics (defined precisely in Definition 4.2.3 below).

Since we will use dynamic inversion in our controller, a primary concern is that any transients induced by the coupling of dynamic inverter to the plant do not cause the internal state of the resulting closed-loop system to become unbounded.

Another motivation for study of the implicit tracking problem has been our use of implicit tracking in the control of nonlinear nonminimum-phase systems, as we will detail in Chapter 6. That application will, in fact, include a slight variation in the structure of the derivative estimators.

4.1.2 Previous Work
To the best of our knowledge, previous work on the output tracking of implicitly defined reference trajectories has been confined to robotics-related work on implicit trajectories satisfying F(γ) − x(t) = 0. Thus we reserve a discussion of such previous work for the next chapter, where the robot control problem is discussed in some detail.

4.1.3 Main Results
The main results of this chapter are:

i. a useful characterization of acceptable internal behavior for nonlinear control systems, called output-bounded internal dynamics (Definition 4.2.7), in which a bound on the output and its derivatives with respect to time implies a bounded internal state,

ii. an algorithm, Algorithm 4.3.1, for constructing derivative estimators for time derivatives of the implicitly defined reference trajectory,

iii. a dynamic tracking controller that causes nonlinear control systems with output-bounded internal dynamics to track implicitly defined reference trajectories,

iv. an implicit tracking theorem (Theorem 4.3.4) describing conditions under which the implicit tracking controller guarantees exponential output tracking convergence with bounded internal state.


4.1.4 Chapter Overview
In Section 4.2 we give a precise definition of the implicit asymptotic tracking problem, and define the class of control systems in which we will be interested. In Section 4.3 we construct a dynamic compensator which (a) produces an explicit estimator for an implicit reference trajectory, and (b) causes the output of a nonlinear plant to converge exponentially to a given implicit reference trajectory. In Section 4.4 we give an example of implicit tracking control for a system that has output-bounded internal dynamics, but unstable zero dynamics. A simulation will illustrate tracking convergence with bounded internal state.

4.2 Problem Definition
In this section, after supplying the necessary assumptions, definitions, and mathematical setting, we precisely define the implicit tracking problem.

4.2.1 System Structure
We will consider nonlinear time-invariant control systems of the form

ẋ = f(x) + g(x)w
y = h(x)   (4.8)

with w and y in ℝᵖ, and x ∈ ℝⁿ.

Assumption 4.2.1 Assume that f : ℝⁿ → ℝⁿ, g : ℝⁿ → ℝ^{n×p}, and h : ℝⁿ → ℝᵖ are sufficiently smooth in x and t. Assume also that f(0) = 0 and h(0) = 0.

Assumption 4.2.2 Vector Relative Degree. Assume that system (4.8) has well-defined vector relative degree³

r = [r₁, r₂, …, rₚ]
3

(4.9)

See Section C of Appendix C for a review of vector relative degree.

Figure 4.1: Schematic of (4.11).

Let

r̄ := max_{i≤p} {r_i}.      (4.10)

Then r̄ is the highest relative degree of any of the outputs y_i of system (4.8), i.e. the maximum number of times one must differentiate any output y_i(t), i ≤ p, in order to see some component w_k, k ≤ p, of the input w.

Through standard state-dependent coordinate and input transformations (see Section C of the Appendix) we may input/output linearize the plant (4.8) so that it takes the form

Plant:
  Σ_ext:  ξ̇_i^j = ξ_i^{j+1},  i ≤ p, j ≤ r_i − 1
          ξ̇_i^{r_i} = u_i,  i ≤ p
  Σ_int:  η̇ = α(ξ, η) + β(ξ, η)u      (4.11)
          y_i = ξ_i^1,  i ≤ p

with ξ_i^j ∈ R and u_i ∈ R. It follows from Assumption 4.2.1 that α(0, 0) = 0. The structure of (4.11) is illustrated in Figure 4.1.


4.2.2 Internal Dynamics

Let

s_r := r_1 + ··· + r_p      (4.12)

be the sum of the relative degrees of the p outputs y_i, i ≤ p. Let

ξ := [ξ_1^1, ξ_1^2, ..., ξ_1^{r_1}, ξ_2^1, ..., ξ_2^{r_2}, ..., ξ_p^{r_p}]ᵀ ∈ R^{s_r}.      (4.13)

It follows that η is in R^{n−s_r}.

Definition 4.2.3 We refer to the dynamics of η,

η̇ = α(ξ, η) + β(ξ, η)u      (Internal Dynamics of P)      (4.14)

obtained from (4.11), with ξ regarded as an exogenous time-dependent function, as the internal dynamics of (4.11). We refer to η as the internal state.

Definition 4.2.4 We refer to the dynamics of η obtained from (4.14) by setting ξ ≡ 0 and u ≡ 0,

η̇ = α(0, η)      (Zero Dynamics of P)      (4.15)

as the zero dynamics of (4.11).

As pointed out by Isidori and Moog [IM91], other useful definitions of zero dynamics for square multi-input, multi-output nonlinear systems are possible. However, those definitions are equivalent to Definition 4.2.4 for the case of decoupled systems such as (4.11).

Definition 4.2.5 If the zero dynamics of (4.11) are asymptotically stable at η = 0, then (4.11) is a minimum-phase system⁴. Otherwise (4.11) is a nonminimum-phase system.

4 If a transfer function H(s) of a linear system has a zero in the right half of the complex plane, then the transfer function, when evaluated for s going along the imaginary axis from −j∞ to +j∞, undergoes a change in phase which is greater, for the same magnitude, than if that zero were replaced by its left-half-plane mirror image; hence the name nonminimum phase [FPEN86].


4.2.3 The Output Space

Let C_p^n[0,∞) denote the normed space of n-times continuously differentiable R^p-valued functions on [0,∞). For y(·) ∈ C_p^n[0,∞), let the norm ‖·‖^(n) be defined by

‖y(·)‖^(n) := sup_{t≥0} ‖( y(t), y^(1)(t), ..., y^(n)(t) )‖.      (4.16)

The open r-ball in C_p^n[0,∞), using the norm (4.16), is

B_r^(n) := { y(·) ∈ C_p^n[0,∞) | ‖y(·)‖^(n) < r }.      (4.17)

Note that if y(·) is in B_r^(n), then for each t ≥ 0, y^(i)(t) ∈ B_r ⊂ R^p, i ≤ n, where B_r is the r-ball in R^p.

For any particular y(·) ∈ C_p^{r̄}[0,∞) (see Equation (4.10) for the definition of r̄), define

Y(t) := [y_1(t), y_1^(1)(t), ..., y_1^{(r_1−1)}(t), y_2(t), ..., y_2^{(r_2−1)}(t), ..., y_p^{(r_p−1)}(t)]ᵀ ∈ R^{s_r}      (4.18)

y^{(r)}(t) := [y_1^{(r_1)}(t), ..., y_p^{(r_p)}(t)]ᵀ.      (4.19)

Note that due to the structure of P (4.11), if y(·) is the output of P (4.11), then Y(t) coincides with ξ(t) of (4.13). Let

ζ(Y, u) := [y_1^(0), ..., y_1^{(r_1−1)}, y_2^(0), ..., y_p^{(r_p−1)}, u_1, ..., u_p]ᵀ.      (4.20)

Note that ζ(Y, u) = ζ(Y, y^{(r)}) since

y_i^{(r_i)} = u_i,  i ≤ p.      (4.21)

Thus

‖y(·)‖^{(r̄)} = sup_{t≥0} ‖(y^(0)(t), ..., y^{(r̄)}(t))‖ < ∞  ⟹  sup_{t≥0} ‖ζ(Y(t), u(t))‖ < ∞      (4.22)

and

sup_{t≥0} ‖ζ(Y(t), u(t))‖ < ∞  ⟹  ‖Y(t)‖ < ∞ for all t ≥ 0.      (4.23)
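To make the norm (4.16) concrete, the following sketch (ours, not from the dissertation) computes ‖y(·)‖^(2) numerically for the scalar signal y(t) = e^{−t} sin t, whose first two derivatives are available in closed form. The supremum is attained at t = 0, where (y, ẏ, ÿ) = (0, 1, −2) has Euclidean norm √5.

```python
import numpy as np

def sup_norm_n(y_and_derivs, t_grid):
    """Numerical version of the C^n sup-norm (4.16):
    sup over t of the Euclidean norm of (y(t), y'(t), ..., y^(n)(t))."""
    stacked = np.stack([f(t_grid) for f in y_and_derivs])  # shape (n+1, len(t_grid))
    return np.max(np.linalg.norm(stacked, axis=0))

# y(t) = e^{-t} sin(t) and its first two derivatives.
y   = lambda t: np.exp(-t) * np.sin(t)
dy  = lambda t: np.exp(-t) * (np.cos(t) - np.sin(t))
ddy = lambda t: -2.0 * np.exp(-t) * np.cos(t)

t = np.linspace(0.0, 20.0, 200001)   # grid includes t = 0 exactly
nrm = sup_norm_n([y, dy, ddy], t)
print(nrm)   # sqrt(5) ≈ 2.2360679..., attained at t = 0
```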

Assumption 4.2.6 Bounded Implicit Output-Reference Trajectory. Assume that the reference output that we wish to track is a particular continuous isolated solution θ*(t) ∈ R^p of

F(θ, t) = 0,      (4.24)

where F : R^p × R_+ → R^p; (θ, t) ↦ F(θ, t) is C^{r̄+2} in θ and t, and satisfies assumptions i, ii, and iii of the dynamic inverse existence lemma, Lemma 2.2.11. Assume, furthermore, that

‖θ*(·)‖^{(r̄)} < ρ      (4.25)

for some constant ρ > 0.


4.2.4 Output-Bounded Internal Dynamics

It is useful to define a class of nonlinear control systems having the property that if the output y(t) and its derivatives up to some finite order are bounded⁵ to be sufficiently small, then the internal state of the system is guaranteed to be bounded. If a control system meets such a criterion then one may essentially ignore the internal dynamics if it can be guaranteed that, in a suitable norm, the output and its derivatives are sufficiently small. Minimum-phase systems (see Definition 4.2.5) are such a class, but the class of minimum-phase systems is only a subclass of the class of systems that possess the desired property. We introduce in the present subsection a property called output-bounded internal dynamics. Informally, a system with output-bounded internal dynamics is one in which the internal state is bounded whenever the output and its derivatives up to some finite order are bounded to be sufficiently small. However, output-bounded internal dynamics does not imply that the bound on the internal state goes to zero as the bound on the output and its derivatives goes to zero. Thus, for instance, smooth control systems with stable zero dynamics (see Definition 4.2.4) also have output-bounded internal dynamics, but systems with output-bounded internal dynamics may have unstable zero dynamics. Consider as an example of a system with output-bounded internal dynamics (and unstable zero dynamics) a planar cart upon which is fixed a bowl in which a ball is free to roll. Assume that no energy is dissipated by the interaction of the rolling ball and the bowl, and that the mass of the ball is a point mass located at the center of the ball.

Figure 4.2: The cart and ball system.


5 Whenever we say that a signal or state is bounded we will mean that its norm (4.16) is bounded above.


The position y (t) of the cart is the output of the cart-ball system, and the acceleration6 u of the cart is the input. The position and velocity of the ball comprise the internal state of the cart-ball system. Assume as indicated in Figure 4.2 that the bowl is such that its center is higher than the region immediately surrounding the center, but that to either side of the center there is a relative minimum with respect to height. Figure 4.3 shows the three resulting equilibria: the unstable one at the center of the bowl, and the stable ones to the left and right of center.

Figure 4.3: Three equilibria of the cart and ball.

Objective: We would like to cause the cart to asymptotically track desired output reference trajectories y_d(t), while allowing the ball to remain in the bowl. The zero dynamics of the cart-ball system are the dynamics of the ball in the bowl when the cart is held still, i.e. u ≡ 0 and y ≡ 0. Clearly the ball of Figures 4.2 and 4.3 is unstable at the origin (center bottom of bowl), hence the zero dynamics of the cart-ball system are unstable. If, however, y(t) and its derivatives up to order 2 are kept sufficiently small, then we may guarantee that the ball never leaves the bowl⁷. This is what we mean by output-bounded internal dynamics.

6 Choosing the acceleration of the cart as the input rather than the force on the cart is equivalent to input-output linearization of the cart-ball system.
7 In fact, due to the assumption that the cart rolls along flat ground, as indicated in Figure 4.2, we need only restrict y to be small, but for generality in our definition of output-bounded internal dynamics we ignore this particular property.


Figure 4.4: If (y(t), ẏ(t), ÿ(t)) is kept sufficiently small for all t ≥ 0, then the ball remains in the bowl.

Now, regarding the internal dynamics of a class of systems of the form (4.11), we make the following formal definition (see Figure 4.5):

Figure 4.5: Output-bounded internal dynamics.

Definition 4.2.7 A system of the form (4.11) has output-bounded internal dynamics if there exist real numbers ε > 0, δ > 0, and δ̄ > 0 such that if ‖y(·)‖^{(r̄)} < ε and ‖η(0)‖ < δ, then ‖η(t)‖ < δ̄ for all t ≥ 0.

Figure 4.6 shows some cart-ball systems that do not have output-bounded internal dynamics. Figure 4.7 shows some more cart-ball systems that do have output-bounded internal dynamics. Note the following:

• The cart-ball system i in Figure 4.7 is obtained from the cart-ball system c in Figure 4.6 by a shift of internal coordinates.

• All cart-ball systems in Figure 4.6 have unstable zero dynamics.

• Cart-ball systems k and l of Figure 4.7 have unstable zero dynamics, while carts g, h, i, and j have stable zero dynamics.

Figure 4.6: Some cart-ball systems that do not have output-bounded internal dynamics.


Figure 4.7: Some cart-ball systems that do have output-bounded internal dynamics.

Remark 4.2.8 Control of carts a, c, and d of Figure 4.6 represents a subclass of nonminimum-phase control problems, including the inverted pendulum on a cart and the controlled bicycle, that we will consider in some depth in Chapter 6.

Assumption 4.2.9 Output-Bounded Internal Dynamics. Assume that the control system (4.11) has output-bounded internal dynamics.

Remark 4.2.10 Unstable Zero-Dynamics. Under Assumption 4.2.9, system (4.11) may have unstable zero dynamics, i.e. even if (4.11) has output-bounded internal dynamics, the origin of the system η̇ = α(0, η) may be unstable.

An example will illustrate Remark 4.2.10.

Example 4.2.11 Output-Bounded Internal Dynamics. Consider the plant

ξ̇^1 = ξ^2
ξ̇^2 = u
η̇ = (η + 3)(η + 1)η(η − 1)(η − 3) + ξ^1 + (ξ^2)² + u      (4.26)
y = ξ^1

Figure 4.8: The zero dynamics vector field φ(η) for the zero dynamics (4.27) of Example 4.2.11. The origin of the zero dynamics is unstable, but η(t) is bounded on [0,∞) when |η(0)| < 3.

The plant (4.26) is of the form (4.11). The zero dynamics of (4.26) are obtained by setting y ≡ 0 and u ≡ 0. Setting y ≡ 0 implies ξ^1 ≡ 0 and ξ^2 ≡ 0. Thus the zero dynamics of system (4.26) is (see also (4.15))

η̇ = (η + 3)(η + 1)η(η − 1)(η − 3) = α(0, η) =: φ(η).      (4.27)
The vector field φ(η) (4.27) for the zero dynamics is graphed in Figure 4.8. It has equilibria at η ∈ {−3, −1, 0, 1, 3}. The zero dynamics is clearly unstable at η = 0, as evidenced by the observation that the slope of φ(η) versus η is positive at η = 0. Note that for all |η(0)| < 3, the solution η(t) of the zero dynamics (4.27) satisfies |η(t)| < 3 for all t ≥ 0. We claim that if |η(0)| < 2 (corresponding to δ = 2 in Definition 4.2.7), and ‖y(·)‖^(2) < 4 (corresponding to ε = 5 in Definition 4.2.7), then the solution η(t) of

η̇ = (η + 3)(η + 1)η(η − 1)(η − 3) + ξ^1 + (ξ^2)² + u      (4.28)

satisfies |η(t)| < 2 for all t ≥ 0. To wit: take as a Lyapunov function candidate for the internal dynamics (4.28)

V(η) := (1/2) η².      (4.29)


Differentiate V(η) with respect to t to get

(d/dt) V(η) = η η̇ = (η + 3)(η + 1)η²(η − 1)(η − 3) + η (ξ^1 + (ξ^2)² + u).      (4.30)

Note that ξ^1 = y, ξ^2 = ẏ, and u = ÿ. Consider the interval [−2, 2] = B̄_2 for η; in particular consider the endpoints of this interval. It may be easily verified by substituting −2 and 2 for η in (4.30) that if

|ξ^1(t) + (ξ^2(t))² + u| = |y(t) + ẏ(t)² + ÿ(t)| < 30      (4.31)

for all t ≥ 0, then (d/dt)V(η) < 0 at η = ±2. It follows that if

‖y(·)‖^(2) < 4,      (4.32)

then η(t) remains in [−2, 2] for all t ≥ 0. Therefore the plant (4.26) has output-bounded internal dynamics about the equilibrium (ξ, η) = (0, 0).
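These claims are easy to check numerically. The sketch below (ours, not from the dissertation; forward-Euler integration) verifies that η φ(η) = −60 at η = ±2, so the Lyapunov derivative (4.30) is negative at the endpoints whenever the forcing satisfies (4.31), that φ has positive slope at the origin (instability of the zero dynamics), and that an unforced zero-dynamics solution started inside (−3, 3) stays bounded, settling at the stable equilibrium η = 1:

```python
import numpy as np

def phi(eta):
    """Zero-dynamics vector field (4.27)."""
    return (eta + 3) * (eta + 1) * eta * (eta - 1) * (eta - 3)

# Endpoint check for the Lyapunov argument: eta * phi(eta) = -60 at eta = +/-2,
# so dV/dt = eta*phi(eta) + eta*(forcing) < 0 there whenever |forcing| < 30.
assert phi(2.0) * 2.0 == -60.0 and phi(-2.0) * (-2.0) == -60.0

# Instability at the origin: slope of phi at 0 is positive (it equals 9).
h = 1e-6
assert (phi(h) - phi(-h)) / (2 * h) > 0

# Forward-Euler simulation of the unforced zero dynamics from eta(0) = 0.5:
# the solution leaves the unstable origin but converges to the equilibrium at 1.
eta, dt = 0.5, 1e-3
for _ in range(10000):           # integrate to t = 10
    eta += dt * phi(eta)
    assert -3.0 < eta < 3.0      # stays in the bowl
print(eta)                       # ≈ 1.0
```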

4.2.5 The Problem

We can now define the problem central to this chapter.

Problem 4.2.12 Asymptotic Implicit Tracking Problem. Let θ*(·) ∈ B_ρ^{(r̄)} ⊂ C_p^{r̄}[0,∞) be an isolated solution of F(θ, t) = 0. Given the control system P (4.11) having output-bounded internal dynamics, find a compensator C such that for the closed-loop system [C, P] defined by Figure 4.9,

i. y(t) → θ*(t) asymptotically,

ii. the internal state of [C, P] is bounded.

In fact the compensator C for tracking implicit trajectories will be dynamic, its dynamic part being a dynamic inverter with state (θ, Γ) as indicated in Figure 4.9. (The state Γ is the part of the dynamic inverter state from which the dynamic inverse is formed, as discussed in Chapter 2, Section 2.4.) The internal state of the closed-loop system [C, P], i.e. the unobservable state, is (θ, Γ, η). Thus if item ii of Problem 4.2.12 is satisfied, then η(t) is bounded on [0,∞).

Figure 4.9: The closed-loop control system [C, P].

4.3 Tracking Control

In this section we will construct the implicit tracking controller and prove that it provides exponentially convergent tracking with bounded internal dynamics. In Subsection 4.3.1 we review tracking control for explicit trajectories for systems of the form (4.11). In Subsection 4.3.2 we apply dynamic inversion to obtain an estimator θ(t) for an implicit reference output θ*(t). Since we will, in general, require estimates of higher time-derivatives of the reference output θ*(t), in Subsection 4.3.3 we give an algorithm for obtaining derivative estimators dependent upon the state of a dynamic inverter. Then in Subsection 4.3.4 we join the dynamic inverter and the derivative estimators to the plant P (4.11) to create the closed-loop system [C, P]. In Subsection 4.3.5 we prove that the resulting system [C, P] provides exponentially convergent tracking with bounded internal dynamics.

4.3.1 Tracking Explicit Trajectories

For a desired reference trajectory y_d(t) = [y_{d1}(t), ..., y_{dp}(t)]ᵀ ∈ C_p^{r̄}[0,∞), let

Y_d(t) := [y_{d1}^(0)(t), ..., y_{d1}^{(r_1−1)}(t), y_{d2}^(0)(t), ..., y_{d2}^{(r_2−1)}(t), ..., y_{dp}^(0)(t), ..., y_{dp}^{(r_p−1)}(t)]ᵀ      (4.33)

and

y_d^{(r)}(t) := [y_{d1}^{(r_1)}(t), ..., y_{dp}^{(r_p)}(t)]ᵀ ∈ R^p,      (4.34)

and assume that y_d(·) is in B_ε^{(r̄)}, equivalently ζ(Y_d(t), y_d^{(r)}(t)) ∈ B_ε for all t ≥ 0. Let {α_i^j}, i ≤ p, j ≤ r_i, be chosen to be real constant coefficients of the polynomials in s,

s^{r_i} + Σ_{j=1}^{r_i} α_i^j s^{j−1},  i ≤ p,      (4.35)

such that all roots of the polynomials have strictly negative real parts. It is a standard and elementary result of linear control theory (see Section B.4 of Appendix B) that the choice of input

u_i = y_{di}^{(r_i)}(t) − Σ_{k=1}^{r_i} α_i^k (ξ_i^k − y_{di}^{(k−1)}(t)),  i ≤ p,      (4.36)

will cause ξ(t) to converge to Y_d(t) with exponentially decaying error. It follows from the form of (4.36) that ζ(ξ(t), u(t)) → ζ(Y_d(t), y_d^{(r)}(t)) exponentially as t → ∞. With the assumption that y_d is in B_ε^{(r̄)}, if ε and ε̄, with 0 < ε < ε̄, are sufficiently small, then the exponential convergence of ξ(t) to Y_d(t) implies that ζ(ξ(t), u(t)) remains in B_ε̄ for all t ≥ 0 (see Figure 4.10). This, combined with Assumption 4.2.9, guarantees that η(t) remains bounded.

Figure 4.10: If ‖ζ(ξ(0), u(0))‖ < ε̄ and ‖ζ(Y_d(t), y_d^{(r)}(t))‖ < ε with ε and ε̄ sufficiently small, then convergence of ζ(ξ(t), u(t)) to ζ(Y_d(t), y_d^{(r)}(t)) preserves the upper bound on the internal state η(t).

In the present case θ*(t) is the desired output reference trajectory we would like to track. Of course if we had an explicit expression for θ*(t) we could simply substitute θ*(t) and its derivatives for y_d(t) and its derivatives in (4.36). We assume, however, that such an explicit expression for θ*(t) may not be available.
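As a quick sanity check of the explicit law (4.36), the following sketch (ours, not from the dissertation) applies it to the simplest relative-degree-two case, a double integrator (p = 1, r_1 = 2) with y_d(t) = sin t and α^1 = α^2 = 1, so the tracking error obeys ë + ė + e = 0 and decays exponentially:

```python
import numpy as np

# Double integrator: xi1' = xi2, xi2' = u, output y = xi1 (one output, r1 = 2).
a1, a2 = 1.0, 1.0                      # coefficients of s^2 + a2*s + a1 (Hurwitz)
yd   = lambda t: np.sin(t)             # explicit reference and its derivatives
dyd  = lambda t: np.cos(t)
ddyd = lambda t: -np.sin(t)

xi1, xi2 = 1.0, 0.0                    # start off the reference: e(0) = 1
dt, T = 1e-3, 20.0
for k in range(int(T / dt)):
    t = k * dt
    # Tracking law (4.36): u = yd'' - a2*(xi2 - yd') - a1*(xi1 - yd)
    u = ddyd(t) - a2 * (xi2 - dyd(t)) - a1 * (xi1 - yd(t))
    xi1, xi2 = xi1 + dt * xi2, xi2 + dt * u   # forward Euler

err = abs(xi1 - yd(T))
print(err)   # tracking error after 20 s, decaying roughly like e^{-t/2}
```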

4.3.2 Estimating the Implicit Reference Trajectory

The implicit reference trajectory θ*(t) may be estimated by using a dynamic inverter satisfying the assumptions of Theorem 2.3.5 on dynamic inversion with vanishing error. In this chapter we will use a dynamic inverse of the form (2.110), which we assume satisfies the assumptions of Theorem 2.4.6. This will allow us to determine a dynamic inverse dynamically. For convenience we repeat that dynamic inverter here:

θ̇ = −μ Γ F(θ, t) − Γ D_2F(θ, t)
Γ̇ = −μ (D_1F(θ, t) Γ − I) Γ − Γ [(d/dt) D_1F(θ, t)] Γ      (4.37)

where (d/dt) D_1F(θ, t) is evaluated with θ̇ replaced by −Γ D_2F(θ, t). For notational simplicity, define

Ḡ[w, Γ] := −μ (Γ w_1, w_2 Γ),   F̄(θ, Γ, t) := (F(θ, t), D_1F(θ, t) Γ − I),      (4.38)

and

Ē(θ, Γ, t) := (−Γ D_2F(θ, t), −Γ [(d/dt) D_1F(θ, t)] Γ),      (4.39)

so that the dynamic inverter (4.37) is represented by

(θ̇, Γ̇) = Ḡ[F̄(θ, Γ, t), Γ] + Ē(θ, Γ, t),      (4.40)

where (θ*(t), Γ*(t)) is defined to be a continuous isolated solution of F̄(θ, Γ, t) = 0, with Γ*(t) = D_1F(θ*, t)⁻¹. By Theorem 2.4.6, (θ(t), Γ(t)) converges to (θ*(t), Γ*(t)) exponentially for sufficiently large μ > 0 if (θ(0), Γ(0)) is sufficiently close to (θ*(0), Γ*(0)).
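A scalar numerical sketch of the inverter (4.37) (ours, not from the dissertation; it borrows the map F(θ, t) = (2 + sin t) tan(θ/10) − 1 from Example 4.4.1 below, whose exact root θ*(t) = 10 arctan(1/(2 + sin t)) is known): θ and the inverse estimate Γ are integrated together with μ = 10 from deliberately poor initial guesses, and θ locks onto the time-varying root.

```python
import numpy as np

mu = 10.0
F   = lambda th, t: (2 + np.sin(t)) * np.tan(th / 10) - 1
D1F = lambda th, t: 0.1 * (2 + np.sin(t)) / np.cos(th / 10) ** 2
D2F = lambda th, t: np.cos(t) * np.tan(th / 10)

theta, Gam = 1.0, 1.0          # rough initial guesses (theta*(0) ≈ 4.64)
dt, T = 1e-4, 10.0
for k in range(int(T / dt)):
    t = k * dt
    E1 = -Gam * D2F(theta, t)                      # estimator of d(theta*)/dt
    # d/dt D1F along theta' = E1 (cf. (4.62) in Section 4.4):
    dD1F = (0.1 * np.cos(t) / np.cos(theta / 10) ** 2
            + 0.02 * (2 + np.sin(t)) * np.tan(theta / 10)
              / np.cos(theta / 10) ** 2 * E1)
    Etil = -Gam * dD1F * Gam                       # estimator of d(Gamma*)/dt
    dtheta = -mu * Gam * F(theta, t) + E1
    dGam   = -mu * (D1F(theta, t) * Gam - 1) * Gam + Etil
    theta, Gam = theta + dt * dtheta, Gam + dt * dGam

theta_star = 10 * np.arctan(1 / (2 + np.sin(T)))
err = abs(theta - theta_star)
print(err)   # small: theta tracks the time-varying root
```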

4.3.3 Estimating Derivatives of Implicit Trajectories

We will substitute estimators for time derivatives of θ*(t) into the tracking law (4.36) in place of the exact time derivatives of θ*(t). In this subsection we show how to obtain such derivative estimates from F(θ, t) as functions of t, θ, and Γ. We may obtain an estimator for θ*^{(k)} for any k ≥ 0 by the following recursive algorithm:

Algorithm 4.3.1 Derivative Estimator Algorithm.

Data:
i. k ∈ Z_+.
ii. The function F(θ, t), assumed to be C^k in θ and t.

If k = 1: Let

E^1(θ, Γ, t) = −Γ D_2F(θ, t).

If k > 1:

E^k(θ, Γ, t) = [(d/dt) E^{k−1}(θ, Γ, t)] evaluated with θ̇ = E^1(θ, Γ, t) and Γ̇ = Ẽ(θ, Γ, t),

where

Ẽ(θ, Γ, t) = −Γ ( Σ_{i=1}^n [∂/∂θ_i D_1F(θ, t)] E_i^1(θ, Γ, t) + D_{2,1}F(θ, t) ) Γ

(see Section 2.4, Remark 2.4.3).

Output: E^k(θ, Γ, t).
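For scalar θ (n = 1) the recursion of Algorithm 4.3.1 is just a chain rule with θ̇ and Γ̇ replaced by E^1 and Ẽ, which makes it easy to implement symbolically. The sketch below (ours, not from the dissertation) does so with sympy and checks, on the map F(θ, t) = (2 + sin t) tan(θ/10) − 1 of Example 4.4.1 (whose root θ*(t) = 10 arctan(1/(2 + sin t)) is known in closed form), that E^1 and E^2 reproduce θ̇* and θ̈* when evaluated at (θ*, Γ*):

```python
import sympy as sp

th, G, t = sp.symbols('theta Gamma t')
F = (2 + sp.sin(t)) * sp.tan(th / 10) - 1

D1F = sp.diff(F, th)
E1   = -G * sp.diff(F, t)                                   # k = 1 case
Etil = -G * (sp.diff(D1F, th) * E1 + sp.diff(D1F, t)) * G   # estimator of dGamma*/dt

def next_estimator(Ek):
    # E^{k+1} = d/dt E^k with theta' -> E1 and Gamma' -> Etil (scalar chain rule)
    return sp.diff(Ek, th) * E1 + sp.diff(Ek, G) * Etil + sp.diff(Ek, t)

E2 = next_estimator(E1)

# Exact root of F(theta, t) = 0 and the exact inverse Gamma* = 1/D1F(theta*, t).
th_star = 10 * sp.atan(1 / (2 + sp.sin(t)))
G_star = 1 / D1F.subs(th, th_star)

for k, Ek in [(1, E1), (2, E2)]:
    diff = Ek.subs({th: th_star, G: G_star}) - sp.diff(th_star, t, k)
    for tv in [0.3, 1.7, 4.0]:
        assert abs(float(diff.subs(t, tv))) < 1e-9
print("E1 and E2 match the exact derivatives of theta*")
```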

Recall that Ẽ(θ, Γ, t) is the estimator for Γ̇*. By construction, the estimators E^i(θ, Γ, t) produced by Algorithm 4.3.1 satisfy E^i(θ*, Γ*, t) = θ*^{(i)}(t), and by Assumption 4.2.6, E^i(θ, Γ, t) is continuously differentiable in each of its arguments for θ and Γ sufficiently close to θ* and Γ*.

Remark 4.3.2 Note that in general,

(d/dt) E^i(θ, Γ, t) ≠ E^{i+1}(θ, Γ, t).      (4.41)

Only at (θ, Γ) = (θ*, Γ*) is equality guaranteed.

Example 4.3.3 Application of the Derivative Estimator Algorithm. Let

F(θ, t) := k sin(θ) − cos(θ) u(t),      (4.42)

where u(t) is a given smooth time function. Then

E^1(θ, Γ, t) = −Γ D_2F(θ, t) = −Γ (−cos(θ) u̇) = Γ cos(θ) u̇.      (4.43)

To get E^2(θ, Γ, t), first

Ẽ(θ, Γ, t) = −Γ [ (−k sin(θ) + cos(θ) u) E^1(θ, Γ, t) + sin(θ) u̇ ] Γ.      (4.44)

Then,

E^2(θ, Γ, t) = [(d/dt) E^1(θ, Γ, t)] with θ̇ = E^1(θ, Γ, t), Γ̇ = Ẽ(θ, Γ, t)
             = [(d/dt) (Γ cos(θ) u̇)] with θ̇ = E^1, Γ̇ = Ẽ
             = Ẽ(θ, Γ, t) cos(θ) u̇ − Γ sin(θ) E^1(θ, Γ, t) u̇ + Γ cos(θ) ü.      (4.45)
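The identities E^1(θ*, Γ*, t) = θ̇*(t) and E^2(θ*, Γ*, t) = θ̈*(t) can be checked numerically for this example. Taking k = 1 and u(t) = sin t (our choices; the example leaves u(t) free), F(θ, t) = 0 gives tan θ = u, so θ*(t) = arctan(sin t) explicitly. The sketch below (ours) compares the estimators at (θ*, Γ*) with finite differences of θ*:

```python
import numpy as np

k = 1.0
u, du, ddu = np.sin, np.cos, lambda t: -np.sin(t)   # u(t) = sin t and derivatives

def E1(th, G, t):                    # (4.43)
    return G * np.cos(th) * du(t)

def Etil(th, G, t):                  # (4.44)
    return -G * ((-k * np.sin(th) + np.cos(th) * u(t)) * E1(th, G, t)
                 + np.sin(th) * du(t)) * G

def E2(th, G, t):                    # (4.45)
    return (Etil(th, G, t) * np.cos(th) * du(t)
            - G * np.sin(th) * E1(th, G, t) * du(t)
            + G * np.cos(th) * ddu(t))

theta_star = lambda t: np.arctan(u(t) / k)   # root of k sin(th) - cos(th) u(t) = 0
D1F = lambda th, t: k * np.cos(th) + np.sin(th) * u(t)

t0, h = 0.7, 1e-4
th0 = theta_star(t0)
G0 = 1.0 / D1F(th0, t0)                      # Gamma* = D1F(theta*, t)^{-1}

fd1 = (theta_star(t0 + h) - theta_star(t0 - h)) / (2 * h)
fd2 = (theta_star(t0 + h) - 2 * th0 + theta_star(t0 - h)) / h ** 2
print(abs(E1(th0, G0, t0) - fd1), abs(E2(th0, G0, t0) - fd2))  # both tiny
```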

4.3.4

Combined Dynamic Inverter and Plant


In this subsection we combine the dynamic inverter (4.37) with derivative estima-

tors and the plant (4.11) to get the closed-loop system [C, P ]. Let Eik (, , t) denote the ith component of the vector-valued function E k (, , t). new control law Substitute the estimators E k (, , t) for , k r , into the control law (4.36) to get the
(k )

Implicit Tracking Control Law


ri

u i (, , , t) =

Eiri (, , t)

k=1

k k i (i Eik1 (, , t))

(4.46)

Combining plant (4.11), dynamic inverter (4.40), and control law (4.46) gives the closed-loop system Implicit [C, P ] Tracking Controller and Plant (4.11) j i ri
i j +1 = i , i p, j ri 1

= u i (, , , t)

= (, ) + (, ) u(, , , t) F (, , t), , + E (, , t) = G

(4.47)

4.3.5 An Implicit Tracking Theorem

Let

Θ*(t) := [θ_1*^(0), θ_1*^(1), ..., θ_1*^{(r_1−1)}, θ_2*^(0), ..., θ_2*^{(r_2−1)}, ..., θ_p*^{(r_p−1)}]ᵀ ∈ R^{s_r}.      (4.48)


The following theorem gives sufficient conditions under which the closed-loop system [C, P] (4.47) will solve the asymptotic implicit tracking problem, Problem 4.2.12.

Theorem 4.3.4 Implicit Tracking Theorem. Assume that

i. θ*(t) is a continuous isolated solution of F(θ, t) = 0, and G[w, θ, t] is a dynamic inverse of F(θ, t),

ii. plant (4.11) has output-bounded internal dynamics,

iii. θ*(·) is in B_ρ^{(r̄)},

iv. the right-hand side of system (4.47) is C² in its arguments and all of its partial derivatives up to order 2 are bounded for all ξ, η, and (θ − θ*, Γ − Γ*) sufficiently small.

If (θ(0) − θ*(0)), (Γ(0) − Γ*(0)), (ξ(0) − Θ*(0)), ρ, and η(0) are sufficiently small, then there exist a δ > 0 and a μ̄ > 0 such that for all η(0) ∈ B_δ ⊂ R^{n−s_r}, and for all positive μ > μ̄, the output y(t) of (4.47) converges exponentially to θ*(t), while (η, θ, Γ) remains bounded.

Proof of Theorem 4.3.4: After some coordinate changes are applied to [C, P] (4.47), we will prove exponential convergence of Y(t) to Θ*(t) for sufficiently large μ using singular perturbation theory⁸. We will rely upon a proof from Khalil [Kha92], restated in Appendix B, to prove exponential convergence of ξ(t) to Θ*(t). Then we will show that bounded internal dynamics is achieved.

A. We will show that a δ > 0 exists such that if η(0) ∈ B_δ, then exponentially convergent tracking with bounded internal state is achieved. Let ε := 1/μ. Define a coordinate change (ξ, η, θ, Γ) → (e, η, w, z) by

e_i^j = ξ_i^j − θ_i*^{(j−1)},
w = θ − θ*,      (4.49)
z = Γ − Γ*,

with η left unchanged. Let u = û as defined by (4.46). Through substitution of the error

8 See Kokotovic, et al. [KHO86] for a review of singular perturbation theory in the context of control theory.


coordinates (4.49) into (4.46), as well as some algebra, we get

û_i(θ, Γ, ξ, t) = E_i^{r_i}(θ, Γ, t) − Σ_{k=1}^{r_i} α_i^k (ξ_i^k − E_i^{k−1}(θ, Γ, t))
= θ_i*^{(r_i)} − Σ_{k=1}^{r_i} α_i^k e_i^k + [E_i^{r_i}(w + θ*, z + Γ*, t) − θ_i*^{(r_i)}]
  + Σ_{k=1}^{r_i} α_i^k [E_i^{k−1}(w + θ*, z + Γ*, t) − θ_i*^{(k−1)}].      (4.50)

Substitute the resulting expression for û into (4.47) to get [C, P] in error coordinates,

ė_i^j = e_i^{j+1},  i ≤ p, j ≤ r_i − 1
ė_i^{r_i} = −Σ_{k=1}^{r_i} α_i^k e_i^k + [E_i^{r_i}(w + θ*, z + Γ*, t) − θ_i*^{(r_i)}]
            + Σ_{k=1}^{r_i} α_i^k [E_i^{k−1}(w + θ*, z + Γ*, t) − θ_i*^{(k−1)}]
η̇ = α(ξ, η) + β(ξ, η) û(w + θ*, z + Γ*, ξ, t)      (4.51)
(ẇ, ż) = Ḡ[F̄(w + θ*, z + Γ*, t), z + Γ*] + Ē(w + θ*, z + Γ*, t) − (θ̇*, Γ̇*)

B. Now consider the error system obtained from (4.51) by omitting the η-dynamics,

ė_i^j = e_i^{j+1},  i ≤ p, j ≤ r_i − 1
ė_i^{r_i} = −Σ_{k=1}^{r_i} α_i^k e_i^k + [E_i^{r_i}(w + θ*, z + Γ*, t) − θ_i*^{(r_i)}]
            + Σ_{k=1}^{r_i} α_i^k [E_i^{k−1}(w + θ*, z + Γ*, t) − θ_i*^{(k−1)}]      (4.52)
(ẇ, ż) = Ḡ[F̄(w + θ*, z + Γ*, t), z + Γ*] + Ē(w + θ*, z + Γ*, t) − (θ̇*, Γ̇*)
We will show that (4.52) satisfies assumptions i through v of Theorem B.3.1 of Appendix B. Note the following:

i. The origin (e, w, z) = (0, 0, 0) is an equilibrium of (4.52).

ii. The equation obtained from the dynamic inverter part of (4.47) by setting ε = 0, namely

0 = Ḡ[F̄(w + θ*, z + Γ*, t), z + Γ*],      (4.53)

has an isolated solution at (w, z) = (0, 0).

iii. By assumption, the right-hand side of (4.52) and its partial derivatives up to order 2 are bounded for sufficiently small (e, w, z).

iv. For the linear time-invariant system

ė_i^j = e_i^{j+1},  i ≤ p, j ≤ r_i − 1
ė_i^{r_i} = −Σ_{k=1}^{r_i} α_i^k e_i^k,      (4.54)

e = 0 is an exponentially stable equilibrium.

v. The origin (w, z) = (0, 0) of

(d/dτ)(w, z) = Ḡ[F̄(w + θ*(t), z + Γ*(t), t), z + Γ*(t)]      (4.55)

is exponentially stable uniformly in t. (Equation (4.55) is a differential equation in τ, with solution (w(τ), z(τ)), where t is considered fixed. See Theorem B.3.1 of Appendix B.)

Then by Theorem B.3.1 of Appendix B there exists an ε̄ > 0 such that for all ε < ε̄, (e, w, z) = (0, 0, 0) is an exponentially stable equilibrium of (4.52). Since ε = 1/μ, it follows that there exists a μ̄ > 0 such that for all μ > μ̄, the origin (e, w, z) = (0, 0, 0) of (4.52) is exponentially stable. Thus ξ(t) goes to Θ*(t) exponentially if (θ(0) − θ*(0)), (Γ(0) − Γ*(0)), and (ξ(0) − Θ*(0)) are sufficiently small.

of y (t) to (t), and of ( (t), (t)) to ( (t), (t)). Since (4.52) does not depend on , this exponential output error convergence is unaected by the evolution of . Nevertheless, we must assure ourselves that remains bounded. Since the plant (4.11) is assumed to have output-bounded internal dynamics, there exists a > 0 and a > 0 such that () B and (0) B imply (t) B for all t 0. in B as (t) converges to (t) (see Figure 4.11). By assumption,
( r) ( r)

We must show that there exists a > 0 such that if (0) is in B , then ( (t), u(t)) remains < which implies that ( (t), (t)
( r)

< (see (4.9)) for

for part B (above) of this proof. However, (0) may converge to (0) in such a way that for some t > 0, ( (t), u(t))

B , (0) is suciently close to (t) to satisfy the requirements of exponential convergence > even though the convergence is asymptotic. Thus the norm

each i p. If is suciently small, then there exists a such that for all ( (0), u(0)) in

Sec. 4.3

Tracking Control

103

R sr+p

((0),u(0)) ((0),(0))

Figure 4.11: As ( (t), u(t)) converges to ( (t), r )(t)) it must remain in B .

imply that the norm decreases monotonically in t. For example, the function et sin(t) is a function that converges to zero exponentially and whose norm is not monotonic in t (see Figure 4.12). We must assure ourselves that there exists a suciently small so that (0)
( r)

x(t) y (t) need only be bounded above by a decaying exponential function. This does not

< implies ( (t), u(t))

. Figure 4.12 illustrates the situation we would like to avoid.

< for t 0 in order to preserve the boundedness of

and k2 > 0 such that

Exponential convergence of (t) to (t) as t implies that there exist k1 > 0,

(t) (t) k1 (0) (0) ek2 t

(4.56)

for all t 0. By choosing and suciently small, we can make (0) (0) as small ()
( r)

as we please. Therefore, we can guarantee by choice of and suciently small, that < . Since the plant P has output-bounded internal dynamics, this guarantees that (t) remains in B for all t 0.


Figure 4.12: If ρ and ε̄ are not sufficiently small, then ζ(ξ(t), u(t)) may converge exponentially to ζ(Θ*(t), θ*^{(r)}(t)) but leave the ball B_ε at some time.

Figure 4.13: If ζ(ξ(0), u(0)) is in B_ε̄, and η(0) is in B_δ, then ‖η(t)‖ < δ̄ for all t ≥ 0. The ball B_ε̄ is that ball in which ζ(ξ(t), u(t)) must remain in order that η(t) remain in B_δ̄. Compare to Figure 4.5.


4.4 An Example of Implicit Tracking

An example will illustrate application of Theorem 4.3.4 to a nonlinear control system having unstable zero dynamics, but output-bounded internal dynamics.

Example 4.4.1 Implicit Trajectory Tracking. We will construct an implicit tracking controller for the plant (4.26) of Example 4.2.11. Recall that plant (4.26) has output-bounded internal dynamics, but unstable zero dynamics. The tracking problem for plant (4.26) corresponds to the tracking problem for the cart of Figures 4.2 through 4.4, where we wish to cause the cart to track an implicitly defined trajectory without having the ball leave the bowl.

Assume that we would like y(t) to track the solution θ*(t) to

F(θ, t) = 0      (4.57)

where

F(θ, t) := (2 + sin(t)) tan(θ/10) − 1.      (4.58)

The equation F(θ, t) = 0 has an explicit solution, θ*(t) = 10 arctan(1/(2 + sin(t))). This will allow us to verify performance of the closed-loop system resulting from application of the implicit tracking controller. We will construct the controller, however, as if only an implicit expression F(θ, t) = 0 for the reference trajectory were available. First we will construct the necessary derivative estimators. We know from Lemma 2.2.7 on dynamic inverses for scalar functions that we can use G(w) = −κw as a dynamic inverse for this F(θ, t), but for purposes of illustration, and in the manner of Theorem 2.4.6, we will use a dynamically estimated dynamic inverse G(w, Γ) = −Γw, where Γ*(t) is the solution to D_1F(θ*, t)Γ − 1 = 0, and

D_1F(θ, t) = (1/10)(2 + sin(t)) sec²(θ/10).      (4.59)

Use the derivative estimator algorithm, Algorithm 4.3.1, to obtain estimators

E^1(θ, Γ, t) = −Γ cos(t) tan(θ/10)      (4.60)

and

E^2(θ, Γ, t) = −Ẽ(θ, Γ, t) cos(t) tan(θ/10) + Γ sin(t) tan(θ/10) − (1/10) Γ cos(t) sec²(θ/10) E^1(θ, Γ, t),      (4.61)

with Ẽ(θ, Γ, t) given by (4.63) below. We have

[(d/dt) D_1F(θ, t)] with θ̇ = E^1(θ, Γ, t)
  = (1/10) cos(t) sec²(θ/10) + (1/50)(2 + sin(t)) sec²(θ/10) tan(θ/10) E^1(θ, Γ, t).      (4.62)

Then, for estimation of Γ̇* we have

Ẽ(θ, Γ, t) := −Γ [(d/dt) D_1F(θ, t)]_{θ̇ = E^1(θ,Γ,t)} Γ,      (4.63)

where Ẽ(θ*, Γ*, t) = Γ̇*.

Let α_2 = α_1 = 1. For the implicit tracking controller define

û(θ, Γ, ξ, t) := E^2(θ, Γ, t) − α_2 (ξ^2 − E^1(θ, Γ, t)) − α_1 (ξ^1 − θ).      (4.64)
Combine the dynamic inverter and plant to get

[C, P]:
ξ̇^1 = ξ^2
ξ̇^2 = û(θ, Γ, ξ, t)
η̇ = (η + 3)(η + 1)η(η − 1)(η − 3) + ξ^1 + (ξ^2)² + û(θ, Γ, ξ, t)      (4.65)
θ̇ = −μ Γ F(θ, t) + E^1(θ, Γ, t)
Γ̇ = −μ (D_1F(θ, t) Γ − 1) Γ + Ẽ(θ, Γ, t)

where û(θ, Γ, ξ, t) is given by (4.64).

4.4.1 Simulations

Figures 4.14 through 4.17 show the results of a simulation of (4.65) with the initial conditions shown in Table 4.1.

variable       ξ^1   ξ^2   η   θ   Γ
initial value   3    −1    0   1   1

Table 4.1: Initial conditions for the implicit tracking controller simulation.

The parameters used in the simulation are shown in Table 4.2. The simulation was integrated using the adaptive-step-size, fourth- and fifth-order Runge-Kutta integrator ode45 in Matlab [Mat92].
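The run is easy to reproduce without Matlab. The sketch below (ours, not from the dissertation) integrates the closed loop (4.65) with forward Euler instead of ode45, using the parameters of Table 4.2 and the initial conditions of Table 4.1 under the mapping ξ^1(0) = 3, ξ^2(0) = −1, η(0) = 0, θ(0) = 1, Γ(0) = 1 (our reading of the flattened table), and checks that the tracking error decays while η stays bounded:

```python
import numpy as np

a1 = a2 = 1.0
mu = 10.0

F   = lambda th, t: (2 + np.sin(t)) * np.tan(th / 10) - 1
D1F = lambda th, t: 0.1 * (2 + np.sin(t)) / np.cos(th / 10) ** 2
E1f = lambda th, G, t: -G * np.cos(t) * np.tan(th / 10)            # (4.60)

def rhs(x, t):
    xi1, xi2, eta, th, G = x
    sec2 = 1 / np.cos(th / 10) ** 2
    E1 = E1f(th, G, t)
    dD1F = 0.1 * np.cos(t) * sec2 + 0.02 * (2 + np.sin(t)) * sec2 * np.tan(th / 10) * E1
    Etil = -G * dD1F * G                                            # (4.63)
    E2 = (-Etil * np.cos(t) * np.tan(th / 10)                       # (4.61)
          + G * np.sin(t) * np.tan(th / 10) - 0.1 * G * np.cos(t) * sec2 * E1)
    u = E2 - a2 * (xi2 - E1) - a1 * (xi1 - th)                      # (4.64)
    deta = (eta + 3) * (eta + 1) * eta * (eta - 1) * (eta - 3) + xi1 + xi2 ** 2 + u
    return np.array([xi2, u, deta,
                     -mu * G * F(th, t) + E1,
                     -mu * (D1F(th, t) * G - 1) * G + Etil])

x = np.array([3.0, -1.0, 0.0, 1.0, 1.0])   # Table 4.1 (our reading)
dt, T = 1e-4, 20.0
eta_max = 0.0
for k in range(int(T / dt)):
    x = x + dt * rhs(x, k * dt)
    eta_max = max(eta_max, abs(x[2]))

err = abs(x[0] - 10 * np.arctan(1 / (2 + np.sin(T))))
print(err, eta_max)   # output error small; eta stayed in the bowl
```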

parameter   α_1   α_2   μ
value        1     1    10

Table 4.2: Parameters for the implicit tracking controller simulation.

The top graph of Figure 4.14 shows the output y(t) (solid), the estimator θ(t) (dashed), and the actual reference trajectory θ*(t) (dotted). Convergence of the three trajectories is readily apparent. The bottom graph of Figure 4.14 shows the output tracking error y(t) − θ*(t) for the simulation. The decay of the tracking error can be seen.

Figure 4.15 shows the internal state η(t). Note that η was initialized at 0, which is unstable. It can be seen that η settles to a region between one of its zero-dynamics equilibria, η = 1, and a value of η = 1.6. Most importantly, η stays bounded. Figure 4.16 shows the estimation errors for θ (top) and Γ (bottom). Both errors can be seen to decay to zero.


Figure 4.14: Top: The output y(t) (solid), as well as the implicit reference trajectory θ* (dotted) and its estimate θ(t) (dashed) for the simulation of Example 4.4.1. Bottom: The output tracking error y(t) − θ*(t).


Figure 4.15: The internal state η(t) for the simulation of Example 4.4.1.

Figure 4.16: The top graph shows the estimation error θ(t) − θ*(t) for Example 4.4.1. The bottom graph shows the estimation error Γ(t) − Γ*(t).


Figure 4.17 shows four phase plots. The top left plot shows ξ^1 versus ξ^2. Comparing this to the top right plot showing θ* versus θ̇*, we can see how the output converges to the implicit reference trajectory. The bottom left plot is not a true phase plot: it shows θ versus E^1(θ, Γ, t). Recall from Remark 4.3.2 that only when (θ, Γ) = (θ*, Γ*) is E^1(θ, Γ, t) guaranteed to equal θ̇. The plot at the lower right of Figure 4.17 shows the tracking error phase (y − θ*, ẏ − θ̇*) = (e^1, e^2) as it converges to zero.

Figure 4.17: The top left graph shows the phase plot of ξ^1 versus ξ^2. The top right graph shows θ* versus θ̇*. The lower left graph shows θ versus E^1(θ, Γ, t). The lower right graph shows the tracking error phase, ξ^1 − θ* versus ξ^2 − θ̇*. The symbol "o" marks the initial conditions for each plot.

The combination of plant and compensator can be seen to have behaved as predicted, with exponentially decaying output error, and with bounded internal dynamics.


4.5 Chapter Summary

We have defined a useful characterization of internal dynamics for nonlinear systems which we have called output-bounded internal dynamics. A nonlinear control system with output-bounded internal dynamics has an acceptable form of internal behavior without having stable zero dynamics. We have combined a dynamic inverter with a standard tracking controller, replacing the explicit reference trajectory and its time derivatives by estimators based on dynamic inversion, to produce a controller for tracking implicit reference trajectories. For systems having output-bounded internal dynamics, we have seen that for suitable initial conditions and gain parameters, the implicit tracking controller keeps the internal dynamics bounded. We have proven, through an appeal to singular perturbation theory, that the combination of nonlinear plant, dynamic inverter, and controller results in exponentially convergent output tracking with bounded internal dynamics for plants having output-bounded internal dynamics. A simulation of a controlled nonlinear system with unstable zero dynamics, but output-bounded internal dynamics, was shown to exhibit the predicted convergence and stability behavior.

We have used the particular dynamic inverter of Theorem 2.4.6. A review of the proof of the implicit tracking theorem, Theorem 4.3.4, easily reveals that this is not the only dynamic inverter that may be used. Any dynamic inverter conforming to the assumptions of Theorem 2.3.5 will do for the example of Section 4.4 above as well as for the implicit tracking theorem. For instance, if D_1F(θ, t)⁻¹ is indeed available in closed form, then it may be substituted for Γ in the controller equations. Then one need not solve a differential equation for Γ.


Chapter 5

Joint-Space Tracking of Workspace Trajectories in Continuous Time


5.1 Introduction
In this chapter we will apply the implicit tracking theorem, Theorem 4.3.4, to the problem of tracking end-effector trajectories for robotic manipulators. The control of robotic manipulators provides considerable motivation for the application of dynamic inversion to the implicit tracking problem. Inverse kinematics is currently a computationally expensive procedure, limiting, as we will discuss in Section 5.3, the performance of robot controllers for important classes of tasks such as the tracking and stabilization of end-effector trajectories, the task which will be considered here. The limitation is sufficiently severe that many commercial robotic manipulators are designed so that there exist closed-form solutions for their inverse kinematics. Using dynamic inversion, one need only approximately solve the inverse kinematics problem at a single configuration. A dynamic inverter uses this solution as the initial condition for a dynamical system, the flow of which is the time-varying solution to the inverse-kinematics problem. Coupling this inverse-kinematic solution to a tracking controller for the robot arm, and applying the implicit tracking theorem, Theorem 4.3.4, gives exponentially convergent tracking.

5.1.1

Previous Work
The map F(θ, t) to be inverted in this chapter is of a special form,

    F(θ, t) = F(θ) − x_d(t)    (5.1)


in which an exclusively time-dependent term is combined additively with an exclusively θ-dependent term. The inverse problem of approximating θ(t) given x_d(t) has received a significant amount of attention in the literature and, indeed, continuous-time algorithms for solving inverse kinematics are classical (see [Nak91]). Wolovich and Elliot [WE84] introduced a set of dynamical equations that solve the inverse kinematics problem x = F(θ), where x is the cartesian configuration vector and θ is the vector of joint angles. Their dynamical system may be expressed as

    dθ/dt = −K DF(θ)ᵀ (F(θ) − x_d(t))    (5.2)

where K is a positive-definite matrix. Viewing this equation in the dynamic inversion framework, we see immediately that this is a dynamic inverter that solves for the solution of

    F(θ) − x_d(t) = 0    (5.3)

with dynamic inverse K DF(θ)ᵀ and no derivative estimator. The lack of a derivative estimator in (5.2) limits the performance of the algorithm unless ẋ_d(t) is small, or the leftmost eigenvalue of K is large. In contrast, we will use derivative estimation as specified in Theorem 4.3.4, and we will be concerned with the tracking problem rather than solely with the inverse kinematics problem.

Tchoń and Dulęba introduced a dynamical method for determining inverse kinematics for manipulators with forward-kinematic maps F which may be expressed in the form

    dθ/dt = −DF(θ)ᵀ adj(DF(θ) DF(θ)ᵀ) F(θ)    (5.4)

where adj(·) refers to the classical adjoint. Here DF is assumed to be surjective, allowing for inverse solutions for redundant manipulators. In the dynamic inversion context it is clear that

    G[w] := DF(θ)ᵀ adj(DF(θ) DF(θ)ᵀ) w    (5.5)

is being used in (5.4) as a dynamic inverse for F(θ), since G[w] is simply a left inverse of DF multiplied by det(DF DFᵀ). Assuming that F(θ) is analytic, expanding F(θ) in a Taylor series about the solution makes it clear that G[w] is a dynamic inverse of F(θ) in a suitably small neighborhood of the solution. Note that no derivative estimator E(θ̄, t) is used. Since Tchoń and Dulęba are concerned with inversion at particular configurations of the end-effector, this is not surprising. Lack of such a derivative estimator, however, limits the utility of their algorithm in the context of tracking end-effector trajectories. Note also that determination of adj(DF) is similar to determination of DF⁻¹, which is something we would like to avoid.


Another continuous-time dynamical handling of inverse kinematics, similar to that which we describe below, has been presented by Nicosia et al. in [NTV91a]. Their approach too fits well into the framework of dynamic inversion as described in this dissertation. Both G(w, θ) = D₁F(θ, t)⁻¹w and G(w, θ) = D₁F(θ, t)ᵀw are used by those authors as dynamic inverses. Derivative estimation similar to the technique presented in Chapter 4 is also used, though rather than assuming knowledge of the time-derivatives of x_d as we will do here, Nicosia et al. use an observer to estimate those derivatives. They also rely heavily upon the availability of D₁F(θ, t)⁻¹. Though such reliance is often feasible in practice, we will not require it, relying instead upon dynamic estimation of a dynamic inverse.
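The character of these classical continuous-time schemes can be seen in a small numerical sketch of a flow of the form (5.2). Everything here — link lengths, gain, target, and the Euler integrator — is an illustrative assumption, not taken from the dissertation:

```python
import numpy as np

# Forward kinematics and Jacobian of a planar two-link arm
# (illustrative link lengths).
L1, L2 = 1.0, 1.0

def F(theta):
    t1, t12 = theta[0], theta[0] + theta[1]
    return np.array([L1*np.cos(t1) + L2*np.cos(t12),
                     L1*np.sin(t1) + L2*np.sin(t12)])

def DF(theta):
    t1, t12 = theta[0], theta[0] + theta[1]
    return np.array([[-L1*np.sin(t1) - L2*np.sin(t12), -L2*np.sin(t12)],
                     [ L1*np.cos(t1) + L2*np.cos(t12),  L2*np.cos(t12)]])

# Flow of the Wolovich-Elliott type: theta' = -K DF(theta)^T (F(theta) - x_d),
# integrated with a fixed-step Euler method toward a constant target x_d.
K, dt = 5.0, 0.005
x_d = np.array([1.0, 1.0])
theta = np.array([0.5, 0.5])
for _ in range(4000):                      # t in [0, 20]
    theta = theta - dt * K * DF(theta).T @ (F(theta) - x_d)

print(np.linalg.norm(F(theta) - x_d))      # residual after the flow
```

For a constant x_d the residual decays toward zero; for a moving x_d(t) the same flow lags behind the solution unless K is made large — which is precisely the role the derivative estimator E¹ plays in the scheme developed in this chapter.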

5.1.2

Main Results
The main results of this chapter are as follows:

i. We define four classes of robotic manipulator controllers, based upon whether the errors used for the control are in the workspace or the joint-space, and further subdivided by whether the object of the controller is to stabilize a workspace trajectory or a joint-space trajectory.

ii. We introduce a controller that provides joint-space tracking of workspace trajectories. The controller is posed in continuous time. Its digital computer implementation then requires only integration of an exponentially stable dynamical system.

The heart of this chapter may be viewed as an application of the implicit tracking theorem, Theorem 4.3.4, to robotic manipulator control. In this application, however, internal dynamics are assumed to be absent or ignorable.

5.1.3

Chapter Overview
In Section 5.2, after some necessary definitions, we precisely define the robotic control problem in which we will be interested. In Section 5.3 we then describe some current methods of robot manipulator control, looking very briefly at some of their strengths and shortcomings. In Section 5.4 we apply the implicit tracking theorem, Theorem 4.3.4, to construct an exact tracking controller for the tracking of end-effector trajectories. In Section 5.5 an example of output tracking for a simple model of a two-link robot arm is used to illustrate the application of the implicit tracking theorem.


5.2

Problem Definition
Let the vector of joint angles¹ of the robotic manipulator be denoted θ ∈ Rⁿ, and the corresponding generalized torques² be τ ∈ Rⁿ. We will concern ourselves with the control of open-chain robotic manipulators³ having equations of motion in the standard form

Robotic Manipulator Dynamic Equations

    τ = K(θ, θ̇) + M(θ) θ̈    (5.6)

where the inertia matrix M(θ) ∈ R^{n×n} is positive-definite and symmetric for all θ ∈ Rⁿ. The vector K(θ, θ̇) contains all Coriolis, centrifugal, frictional, damping, and gravitational forces.

The forward-kinematics map F relates the generalized coordinates of the robotic manipulator to the configuration of the end-effector x,

Forward-Kinematic Relation

    x = F(θ)    (5.7)

Depending upon the particular manipulator, x may take values in various sets, including subgroups of the special Euclidean group SE(3), the group of positions and orientations in Euclidean 3-space. We call the set of all possible end-effector configurations x the workspace, X. We call each element of the workspace a pose, since each x ∈ X corresponds to a pose of the end-effector. The map F(θ) will be assumed to be C².

Typically the joint-space is a non-euclidean manifold. For the purposes of this chapter we may view the joint-space, as well as the workspace, which we will assume to be of the same dimension as the joint-space, through charts⁴ from Rⁿ. For simplicity, we will at first avoid a discussion of redundant manipulators, manipulators having degrees of freedom greater in number than the dimension of the space

¹ By a joint angle we mean a parameter uniquely describing a joint configuration. Thus, for instance, the angle may parameterize a prismatic as well as a rotary joint. The vector of joint-angles is a set of generalized coordinates for the robotic manipulator.
² We will use the term torques to mean control forces or control torques as appropriate.
³ By open-chain robotic manipulator, we mean a finite sequence of rigid links, the first link being hinged to the ground, with all successive links hinged to the previous link by a joint. The end of the last link is presumed to be free to move in the workspace.
⁴ See [AMR88], Chapter 3 for a review of manifolds and their associated charts.


in which x_d(t) resides. Then in Remark 5.4.4 we will show how our controller may be easily adapted for use with redundant manipulators. Discussion of singularities in the inverse-kinematics problem will also be avoided. We will assume that the inverse-kinematic image of the desired workspace path, i.e. θ*(t) satisfying F(θ*(t)) − x_d(t) = 0, does not pass through such a singularity. The workspace tracking problem considered here is as follows:

Problem 5.2.1 Workspace Tracking Problem. Find a control τ such that for all initial conditions θ₀ ∈ Rⁿ in an open subset of Rⁿ, the pose x(t) of the end-effector converges exponentially to the desired end-effector trajectory x_d(t).

Assumption 5.2.2 Smoothness. Assume that the desired end-effector trajectory x_d(t) is C⁴ on [0, ∞), that the forward-kinematic map F(θ) is also C⁴ on Rⁿ, and that DF(z + θ*(t)), DF(z + θ*(t))⁻¹, and D²F(z + θ*(t)) are bounded uniformly in t for all z ∈ B_r. Assumption 5.2.2 will provide the degree of smoothness necessary to invoke the implicit tracking theorem.

Given a particular end-effector pose x_p ∈ X, the inverse-kinematics problem is to find θ_p satisfying x_p = F(θ_p). In general, multiple solutions θ_p exist. For simplicity we will further restrict the space from which we draw desired output trajectories x_d(t) to those output trajectories that have corresponding, though possibly multiple, continuous isolated solutions θ*(t), i.e. that do not go through singularities. For robotic manipulators having n_i input torques and n degrees of freedom, n_i ≤ n, with the rank of DF(θ) being n_i, the inverse function theorem (see [AMR88], Section 2.5, page 116) implies that inverse-kinematic solutions are isolated. Exceptions to this isolation are at discrete singularities where DF(θ) drops rank. Thus, the added restriction on x_d(t) is mild. For simplicity in presenting the controller, we will consider only manipulators with n_i = n.

Since there are in general multiple continuous isolated solutions θ*(t) of F(θ) = x_d(t), we will assume that a particular one has been chosen. It will be demonstrated below, in Section 5.5, that it is only a matter of choice of initial conditions for the tracking controller that will cause the manipulator to follow one inverse-kinematic solution over another.

5.3

Manipulator Tracking Control Methodologies


Current techniques of tracking control for robotic manipulators (see [Cra89], [SV89], [MLS94]) can be divided into two classes according to whether tracking-error feedback is realized in terms of workspace errors (i.e. x − x_d, ẋ − ẋ_d) or joint-space errors (i.e. θ − θ_d, θ̇ − θ̇_d). These classes are as follows:

Figure 5.1: A sequence of poses {x_d(t_k)} along the workspace trajectory is inverted via an inverse-kinematics algorithm. The resulting sequence of joint-space points {θ_d(t_k)} is then splined to form θ̃_d(t).

i. Joint-Space Control of Joint-Space Trajectories. A discrete inverse-kinematics algorithm is applied to a time-parameterized sequence of chosen points {x_d(t_k)}, called via points, along a continuous desired pose trajectory t ↦ x_d(t) ∈ X. This discrete inversion produces a corresponding time-parameterized sequence of joint-angle vectors {θ_d(t_k)}. One may then create, via a spline, a smooth time-parameterized curve θ̃_d(t) ∈ Rⁿ through the sequence {θ_d(t_k)} (see Figure 5.1), and then track θ̃_d(t) using a tracking controller described in terms of the errors (θ − θ̃_d) and (θ̇ − dθ̃_d/dt), such as

    τ = K(θ, θ̇) + M(θ) [ d²θ̃_d/dt² − B₂(θ̇ − dθ̃_d/dt) − B₁(θ − θ̃_d) ]

where B₁ and B₂ are positive-definite gain matrices in R^{n×n}.

ii. Workspace Control of Workspace Trajectories. To transform the dynamic equations of the robot into workspace coordinates, differentiate the forward-kinematics relationship x = F(θ) twice with respect to t,

    ẋ = DF(θ) θ̇    (5.8)

    ẍ = ( Σ_{i=1}^{n} θ̇_i ∂DF(θ)/∂θ_i ) θ̇ + DF(θ) θ̈    (5.9)

and solve for θ̈,

    θ̈ = DF(θ)⁻¹ [ ẍ − ( Σ_{i=1}^{n} θ̇_i ∂DF(θ)/∂θ_i ) θ̇ ]    (5.10)

Substitute the result (5.10) for θ̈ into the manipulator's dynamical equations (5.6),

    τ = K(θ, θ̇) + M(θ) DF(θ)⁻¹ [ ẍ − ( Σ_{i=1}^{n} θ̇_i ∂DF(θ)/∂θ_i ) θ̇ ]    (5.11)

and left-multiply both sides of (5.11) by (DF(θ)⁻¹)ᵀ,

    (DF(θ)⁻¹)ᵀ τ = (DF(θ)⁻¹)ᵀ M(θ) DF(θ)⁻¹ ẍ − (DF(θ)⁻¹)ᵀ M(θ) DF(θ)⁻¹ ( Σ_{i=1}^{n} θ̇_i ∂DF(θ)/∂θ_i ) θ̇ + (DF(θ)⁻¹)ᵀ K(θ, θ̇)    (5.12)

Then choose gain matrices B₁ and B₂ in R^{n×n} for error feedback in terms of the workspace errors (ẋ − ẋ_d) and (x − x_d) to obtain a tracking controller for tracking the desired x_d(t) ∈ X as follows:

    τ = K(θ, θ̇) + M(θ) DF(θ)⁻¹ [ v − ( Σ_{i=1}^{n} θ̇_i ∂DF(θ)/∂θ_i ) θ̇ ]
    v = ẍ_d(t) − B₂(ẋ − ẋ_d(t)) − B₁(x − x_d(t))    (5.13)

Each of the above approaches has its advantages and limitations. In the first class of controllers, i above, if accuracy of end-effector pose is to be achieved, one must solve a great number of individual inverse-kinematic problems in order to find the corresponding sequence of points in joint space. If a disturbance causes the end-effector to move substantially from its desired trajectory, a new sequence of workspace points may have to be inverted in order to fulfill the desired error dynamics. The joint-space spline from one via point to the next may correspond to a workspace path that diverges substantially from the desired workspace trajectory for points midway between the via points. This can cause a lack of uniformity in the workspace error, as indicated schematically in Figure 5.2. This approach has also necessitated a combined discrete-time, continuous-time approach to workspace tracking control of robotic manipulators. Intrinsically, it is not real-time, since the next via point in the joint space must be determined before the spline from the previous via point can be created, time-parameterized, and tracked. Joint-space control does have an advantage in that joint parameterizations are global. Thus one need not change coordinates in the middle of a control task.
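The nonuniform workspace error of the spline-through-via-points approach is easy to reproduce numerically. The sketch below uses an illustrative planar two-link arm and the simplest possible joint-space interpolation (linear) between exactly inverted via points; none of its numbers come from the dissertation:

```python
import numpy as np

L1, L2 = 1.0, 1.0   # illustrative link lengths

def F(theta):
    t1, t12 = theta[0], theta[0] + theta[1]
    return np.array([L1*np.cos(t1) + L2*np.cos(t12),
                     L1*np.sin(t1) + L2*np.sin(t12)])

def ik(x):
    # Closed-form elbow-down inverse kinematics of the two-link arm.
    c2 = (x @ x - L1**2 - L2**2) / (2*L1*L2)
    t2 = np.arccos(np.clip(c2, -1.0, 1.0))
    t1 = np.arctan2(x[1], x[0]) - np.arctan2(L2*np.sin(t2), L1 + L2*np.cos(t2))
    return np.array([t1, t2])

def x_d(t):  # desired workspace path: a small circle
    return np.array([1.2 + 0.3*np.cos(t), 0.3*np.sin(t)])

ts = np.linspace(0.0, 2.0, 5)              # via-point times
thetas = [ik(x_d(t)) for t in ts]          # exactly inverted via points

# Workspace error at via points vs. midway between them
err_via = max(np.linalg.norm(F(th) - x_d(t)) for th, t in zip(thetas, ts))
err_mid = 0.0
for k in range(len(ts) - 1):
    tm = 0.5*(ts[k] + ts[k+1])
    th_m = 0.5*(thetas[k] + thetas[k+1])   # joint-space interpolant at midpoint
    err_mid = max(err_mid, np.linalg.norm(F(th_m) - x_d(tm)))

print(err_via, err_mid)   # error vanishes at via points, not in between
```

The error is zero (to roundoff) at the via points and strictly positive between them — the schematic behavior of Figure 5.2.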


Figure 5.2: The black curve on the left corresponds to the desired end-effector trajectory x_d(t). The black dots on the left correspond to points of x_d(t) at a discrete sequence of times t1 < t2 < t3 < t4. The black curve on the right corresponds to the inverse-kinematic solution θ_d(t) satisfying F(θ_d(t)) = x_d(t). The black dots on the right correspond to the inverse-kinematic solutions θ_d(t_k) satisfying F(θ_d(t_k)) = x_d(t_k). The white curve on the right corresponds to a time-parameterized spline θ̃_d(t) through the sequence {θ_d(t_k)}. The white curve on the left is F(θ̃_d(t)). Note that the error between x_d(t) and F(θ̃_d(t)) is nonuniform, going to zero at the sample points and diverging from x_d(t) away from the sample points.

In the second class of controllers, ii above (see [MLS94], Section 5.4 for more details on workspace control), one need not solve for an inverse-kinematic solution, though DF(θ) must be inverted. This method too can be undesirable, since the inputs to the manipulator are often joint torques. Avoidance of saturation of the joint torques, for instance, is made difficult. If the mass matrix of the manipulator cannot be conveniently inverted symbolically, then, once again, a mixed discrete- and continuous-time control scheme is necessitated in order to apply, e.g., Gauss elimination, to solve for θ̈ at the start of each continuous control interval. In addition, since the workspace is usually SE(3), and since no global parameterization of SE(3) exists⁵, this approach can necessitate the overhead of coordinate changes in the controller implementation. However, specifying control gains in
⁵ Quaternion representations of SE(3) can make this problem less serious.


the workspace coordinates can be advantageous for certain combinations of manipulator and task. In fact, the two control strategies above suggest the existence of two more control strategies. By distinguishing between workspace and joint-space control, as well as between workspace and joint-space trajectories we see that there are in fact four distinct strategies, as illustrated in Figure 5.3:

Four Classes of Robotic Manipulator Control

    JCJT. Joint-space control of joint-space trajectories,
    WCWT. Workspace control of workspace trajectories,
    WCJT. Workspace control of joint-space trajectories,
    JCWT. Joint-space control of workspace trajectories.    (5.14)

Strategy JCJT corresponds to i above, and strategy WCWT corresponds to ii above. This chapter describes a JCWT method, symbolized by the black arrow in Figure 5.3, of joint-space control of workspace trajectories based on dynamic inversion and the implicit tracking results of the last chapter. This alternative allows one to pose the controller in joint-space while continuously providing an estimate of θ*(t) satisfying x_d(t) = F(θ*(t)), allowing continuous-time control in joint-space. The continuous-time approach also has the virtue of a degree of independence from the choice of computational machinery. For realization of the control via digital computer one must choose an integrator in order to integrate the dynamic inverter. The issue of accuracy, however, is made solely a matter of the choice of integrator. Using our method, we also retain the advantage of global control coordinates.

5.3.1

Workspace Control of Joint-space Trajectories


Workspace control of joint-space trajectories (WCJT) is easily obtained from workspace control of workspace trajectories (WCWT) by replacing x_d(t), ẋ_d(t), and ẍ_d(t) in (5.13) by, for a given desired joint trajectory θ_d(t),

    x_d(t) = F(θ_d(t)),    ẋ_d(t) = DF(θ_d(t)) θ̇_d(t)    (5.15)


Figure 5.3: Each of the four robot control strategies is represented by one of the four arrows. This chapter presents a JCWT strategy, indicated by the black arrow.

and

    ẍ_d(t) = ( Σ_{i=1}^{n} θ̇_{d,i}(t) ∂DF(θ_d(t))/∂θ_i ) θ̇_d(t) + DF(θ_d(t)) θ̈_d(t).    (5.16)

Though WCJT completes the picture illustrated in Figure 5.3, any advantage in its use is unclear at present. It may be useful in cases where one wishes to control joint motions on a robotic manipulator through exogenous forces applied to the end-effector.

5.4

Joint-Space Control of Workspace Trajectories


We now apply dynamic inversion to the problem of tracking workspace trajectories using joint-space control. Given a desired end-effector trajectory t ↦ x_d(t), the inverse-kinematic solution θ*(t) to F(θ) = x_d(t) is defined implicitly as a continuous isolated solution of F(θ, t) = 0, where

    F(θ, t) := F(θ) − x_d(t)    (5.17)

The use of dynamic inversion for the tracking of implicitly defined trajectories is described in Chapter 4. Those arguments will be specialized here to the case of robotic manipulator control. Note that F(θ, t) is a sum of one θ-dependent term and one t-dependent term. Thus, assuming x_d(t) is C² in t, we have the necessary uniformity in t to conclude from the dynamic inverse existence lemma, Lemma 2.2.11, that we may use a dynamic inverse G(w), linear in w and based on (DF(θ))⁻¹.

From the form of the manipulator dynamical equations (5.6), and since M(θ) is

positive-definite, it is clear by substitution into (5.6) that the feedback torque

    τ = K(θ, θ̇) + M(θ) v    (5.18)

applied to (5.6) causes the resulting controlled manipulator dynamics

    θ̈ = v    (5.19)

to be linear from input to state, as well as decoupled⁶. Let e := θ − θ*(t) denote the tracking error between the manipulator configuration θ and the inverse-kinematic solution θ*(t). Let β¹_i, β²_i ∈ R, i ≤ n, be such that the roots of the polynomial in s,

    s² + β²_i s + β¹_i,    (5.20)

have strictly negative real parts. Suppose we had explicit signals θ*(t), θ̇*(t), and θ̈*(t). Choosing v in (5.19) as

    v_i := θ̈*_i(t) − β²_i (θ̇_i − θ̇*_i(t)) − β¹_i (θ_i − θ*_i(t)) = θ̈*_i(t) − β²_i ė_i − β¹_i e_i    (5.21)

results in controlled manipulator dynamics having exponentially stable tracking error. If the trajectory θ*(t) were given explicitly, our job would be done. However, we do not have explicit expressions for θ*(t), θ̇*(t), and θ̈*(t), since we do not have an explicit expression
⁶ By the dynamics being decoupled we mean that for each i ≤ n, θ̈_i = v_i.


for θ*(t). We will construct estimators for θ̇* and θ̈* that will depend upon the state of a dynamic inverter as well as the desired workspace trajectory x_d(t).

We may approximate the time derivatives of θ*(t) using Algorithm 4.3.1. We require approximators E¹(Φ, t) for θ̇* and E²(θ̄, Φ, t) for θ̈*. Recall that Φ ∈ R^{n×n} is part of the state of the dynamic inverter used in the construction of a dynamic inverse. For E¹ we get

    E¹(Φ, t) = −Φ D₂F(θ̄, t) = Φ ẋ_d(t)    (5.22)

For E(θ̄, Φ, t), the estimator of Φ̇, we get

    E(θ̄, Φ, t) = −Φ ( D₁,₁F(θ̄, t)·E¹(Φ, t) + D₂,₁F(θ̄, t) ) Φ = −Φ ( Σ_{i=1}^{n} (∂DF(θ̄)/∂θ_i) E¹_i(Φ, t) ) Φ    (5.23)

Then for E²(θ̄, Φ, t) we get

    E²(θ̄, Φ, t) = (d/dt) E¹(Φ, t), evaluated with dθ̄/dt = E¹(Φ, t) and dΦ/dt = E(θ̄, Φ, t),
                = E(θ̄, Φ, t) ẋ_d + Φ ẍ_d    (5.24)

Summarizing,

Derivative Estimators

    E¹(Φ, t) = Φ ẋ_d(t)
    E²(θ̄, Φ, t) = E(θ̄, Φ, t) ẋ_d + Φ ẍ_d    (5.25)
    E(θ̄, Φ, t) = −Φ ( Σ_{i=1}^{n} (∂DF(θ̄)/∂θ_i) E¹_i(Φ, t) ) Φ
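As a sanity check on estimators of the form (5.25): when Φ is exactly DF(θ̄)⁻¹, the estimator E should agree with the derivative of the inverse Jacobian along the estimated motion. A small numerical sketch — the two-link geometry and the test values are illustrative assumptions, not the dissertation's:

```python
import numpy as np

L1, L2 = 1.0, 1.0  # illustrative link lengths

def DF(th):
    t1, t12 = th[0], th[0] + th[1]
    return np.array([[-L1*np.sin(t1) - L2*np.sin(t12), -L2*np.sin(t12)],
                     [ L1*np.cos(t1) + L2*np.cos(t12),  L2*np.cos(t12)]])

def dDF(th, i):
    # partial of DF with respect to theta_i, by central differences
    h, e = 1e-6, np.zeros(2)
    e[i] = h
    return (DF(th + e) - DF(th - e)) / (2*h)

th_bar = np.array([0.4, 0.9])
Phi = np.linalg.inv(DF(th_bar))            # exact dynamic inverse
xd_dot = np.array([0.3, -0.2])             # an arbitrary desired velocity

E1 = Phi @ xd_dot                          # estimator of theta*-dot, as in (5.25)
E = -Phi @ sum(E1[i]*dDF(th_bar, i) for i in range(2)) @ Phi  # estimator of Phi-dot

# Compare with a finite-difference derivative of DF(.)^{-1} along th_bar' = E1
h = 1e-6
Phidot_fd = (np.linalg.inv(DF(th_bar + h*E1)) -
             np.linalg.inv(DF(th_bar - h*E1))) / (2*h)

print(np.max(np.abs(E - Phidot_fd)))       # near zero
```

This is just the matrix identity d(J⁻¹)/dt = −J⁻¹ J̇ J⁻¹ with θ̇ replaced by its estimate E¹, which is all that (5.23) encodes.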

Remark 5.4.1 Notation. For the remainder of this chapter we will let θ denote the joint-angle vector, θ* denote the inverse-kinematic solution to F(θ, t) = 0, and (θ̄, Φ) denote the state of the dynamic inverter, which includes the estimator θ̄ for the inverse-kinematic solution. Let

    v_i := E²_i(θ̄, Φ, t) − β²_i (θ̇_i − E¹_i(Φ, t)) − β¹_i (θ_i − θ̄_i)    (5.26)

where, as noted above, we denote the estimators for θ̇* and θ̈* by E¹ and E² respectively. Recall that E¹(Φ, t) and E²(θ̄, Φ, t) are given by (5.25). Also, by Assumption 5.2.2, E¹(Φ, t) and E²(θ̄, Φ, t) are C² in their arguments.


In order to estimate a linear dynamic inverse G(w, Φ) = Φw for F(θ̄, t), we let

    F(Φ, θ̄) := DF(θ̄)Φ − I,    (5.27)

with F(Φ, θ̄) ∈ R^{n×n}. As in Example 2.4.1, the dynamic inverse of F(Φ, θ̄) is G: R^{n×n} × R^{n×n} → R^{n×n} defined by

    G(W, Φ) = ΦW    (5.28)

We have already obtained an estimator for Φ̇, namely E(θ̄, Φ, t) in (5.25). Combining estimation, control, and manipulator dynamics we make the following claim.

Corollary 5.4.2 Joint Space Controller for Workspace Trajectories. Let F(θ) and x_d: [0, ∞) → X be C⁴. Let θ*(t) be a continuous isolated solution of F(θ) = x_d(t). Let DF(θ*(t)) and its inverse be bounded for all t, and for all z ∈ B_r, r > 0, let D²F(z + θ*(t)) be bounded. Let B_j := diag(β^j_1, . . . , β^j_n), j ∈ {0, 1}, and let μ > 0. Let E¹(Φ, t), E²(θ̄, Φ, t), and E(θ̄, Φ, t) be given by (5.25). Then if (Φ(0), θ̄(0)) is sufficiently close to (DF(θ*(0))⁻¹, θ*(0)), the control system

    τ = K(θ, θ̇) + M(θ) θ̈    (5.29)

    τ(θ, θ̇, θ̄, Φ, t) = K(θ, θ̇) + M(θ) v(θ, θ̇, θ̄, Φ, t)    (5.30)

    v(θ, θ̇, θ̄, Φ, t) = E²(θ̄, Φ, t) − B₁(θ̇ − E¹(Φ, t)) − B₀(θ − θ̄)    (5.31)

    dθ̄/dt = −μ Φ (F(θ̄) − x_d(t)) + E¹(Φ, t)
    dΦ/dt = −μ Φ (DF(θ̄)Φ − I) + E(θ̄, Φ, t)    (5.32)

causes the joint angle vector θ(t) to converge to θ*(t) exponentially as t → ∞.

Remark 5.4.3 Equations (5.29) are the equations of motion for the manipulator. Equations (5.30) and (5.31) determine the input τ as a function of θ, θ̇, θ̄, and Φ. Equations (5.32) provide exponentially convergent estimates θ̄(t) and Φ(t) of θ*(t) and DF(θ*(t))⁻¹.

Sec. 5.5

A Two-Link Example

125

Proof of Corollary 5.4.2: This is a straightforward application of Theorem 4.3.4 for the case of ignorable or nonexistent internal dynamics.

Remark 5.4.4 Redundant Manipulators. In the case of a redundant manipulator, θ is of a dimension m greater than the workspace dimension n. Thus an infinite number of joint angle vectors correspond to any particular end-effector configuration. Assuming that DF(θ) is surjective, a tracking controller of the same form as the controller of Corollary 5.4.2 may be used. The only modification necessary to the controller of Corollary 5.4.2 for tracking for redundant manipulators is that Φ is m × n rather than n × n. In that case Φ is the right inverse of DF(θ̄) (see Chapter 3, Section 3.2.1). The derivative estimators too remain of the same form.
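The Φ-dynamics in (5.32) are themselves a dynamic inverter for matrix inversion: the flow dΦ/dt = −μΦ(AΦ − I) drives Φ toward A⁻¹. A minimal sketch with an illustrative, slowly varying matrix A(t) — with the feedforward term E omitted here, so the inverse is tracked only up to an O(1/μ) residual:

```python
import numpy as np

def A(t):  # an illustrative, slowly varying invertible matrix
    return np.array([[2.0 + 0.5*np.sin(0.2*t), 0.3],
                     [0.1, 1.5]])

mu, dt, T = 50.0, 0.001, 5.0
Phi = np.linalg.inv(A(0.0))          # start at the exact inverse
t = 0.0
while t < T:
    # dynamic inverter for matrix inversion: Phi' = -mu Phi (A Phi - I)
    Phi = Phi + dt * (-mu * Phi @ (A(t) @ Phi - np.eye(2)))
    t += dt

print(np.linalg.norm(Phi - np.linalg.inv(A(T))))   # small residual, O(1/mu)
```

Adding the estimator term E, as in (5.32), removes the O(1/μ) lag to first order, which is why the corollary can assert exponential convergence rather than mere ultimate boundedness.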

5.5

A Two-Link Example
In this section we work through an example of the application of Theorem 4.3.4 to the control of a simple model of a two-link robotic arm, diagrammed in Figure 5.4.

Figure 5.4: A two-link robot arm with joint angles θ = (θ1, θ2), joint torques τ = (τ1, τ2), end-effector position x, desired end-effector position x_d, link lengths l1 and l2, and link masses m1 and m2, assumed to be point masses.

The links of the robot arm are assumed rigid and of length l1 and l2. The masses


of each link are assumed, for simplicity, to be point masses m1 and m2 located at the distal ends of link 1 and link 2 respectively. The desired position of the end-effector at time t is x_d(t). The actual position is x(t). We wish to make the end-effector (end of the second link) track a prescribed trajectory x_d(t) in the Euclidean plane. The joint-space of the arm is parameterized by T², where T² is the 2-torus. As alluded to earlier in Section 5.3, for our purposes we may view T² through a single chart from R², since neither joint of the arm will ever undergo a full circular motion, due to our choices of x_d(t) and initial joint conditions. We will assume that for i ∈ {1, 2} we may exert a control torque τ_i at the i-th joint, and will denote the vector of input torques by τ ∈ R². In this two-link manipulator case F: R² → R²; F(θ) maps the configuration space to the Euclidean plane. Let c_i := cos(θ_i), c_ij := cos(θ_i + θ_j), s_i := sin(θ_i), and s_ij := sin(θ_i + θ_j), with i, j ∈ {1, 2}. For the two-link arm, the forward-kinematics map is

    F(θ) = [ l1 c1 + l2 c12 ,  l1 s1 + l2 s12 ]ᵀ    (5.33)

The workspace of the two-link robot arm is the codomain of F, namely {x ∈ R²: x = F(θ), θ ∈ T²}. We wish to determine a τ such that the end-effector position x(t) = F(θ(t)) converges to the desired end-effector position x_d(t).

The equations of motion for the two-link manipulator (see [Cra89], Section 6.8) are

    τ = V(θ, θ̇) + W(θ) + M(θ) θ̈    (5.34)

where

    M11(θ) = l2² m2 + 2 l1 l2 m2 c2 + l1² (m1 + m2)
    M12(θ) = M21(θ) = l2² m2 + l1 l2 m2 c2    (5.35)
    M22(θ) = l2² m2

and

    V(θ, θ̇) = [ −m2 l1 l2 s2 θ̇2² − 2 m2 l1 l2 s2 θ̇1 θ̇2 ,  m2 l1 l2 s2 θ̇1² ]ᵀ    (5.36)

    W(θ) = [ m2 l2 g c12 + (m1 + m2) l1 g c1 ,  m2 l2 g c12 ]ᵀ    (5.37)

The matrix M(θ) is a positive-definite symmetric mass matrix, V(θ, θ̇) is the vector of centrifugal and Coriolis forces on the manipulator, and W(θ) is the gravitational force on the point masses of the arm.
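The model (5.34)–(5.37) translates directly into code. The sketch below takes the numeric link and mass values from Table 5.1 as a plausible choice (they are otherwise an assumption of this sketch) and checks that M(θ) is symmetric positive-definite:

```python
import numpy as np

l1, l2, m1, m2, g = 3.0, 2.0, 1.0, 1.0, 9.8   # values as in Table 5.1

def M(th):
    c2 = np.cos(th[1])
    m11 = l2**2*m2 + 2*l1*l2*m2*c2 + l1**2*(m1 + m2)
    m12 = l2**2*m2 + l1*l2*m2*c2
    return np.array([[m11, m12], [m12, l2**2*m2]])

def V(th, thd):
    s2 = np.sin(th[1])
    return np.array([-m2*l1*l2*s2*thd[1]**2 - 2*m2*l1*l2*s2*thd[0]*thd[1],
                      m2*l1*l2*s2*thd[0]**2])

def W(th):
    c1, c12 = np.cos(th[0]), np.cos(th[0] + th[1])
    return np.array([m2*l2*g*c12 + (m1 + m2)*l1*g*c1, m2*l2*g*c12])

def tau(th, thd, thdd):
    # inverse dynamics (5.34): torque producing acceleration thdd at (th, thd)
    return V(th, thd) + W(th) + M(th) @ thdd

th = np.array([0.3, 0.7])
print(np.linalg.eigvalsh(M(th)))   # both eigenvalues positive: M is SPD
```

At rest (θ̇ = 0, θ̈ = 0) the torque reduces to pure gravity compensation, τ = W(θ), which is a quick consistency check on (5.36)–(5.37).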


Let θ*(t) be the solution of F(θ, t) = 0, where F(θ, t) := F(θ) − x_d(t). If we were tracking an explicit joint trajectory θ_d(t) we could choose an input torque

    τ = V(θ, θ̇) + W(θ) + M(θ) [ θ̈_d − B₂(θ̇ − θ̇_d) − B₁(θ − θ_d) ]    (5.38)

where B₁ = diag(β¹₁, β¹₂) and B₂ = diag(β²₁, β²₂), in order to achieve exponentially convergent tracking. However, the trajectory we wish to track is defined implicitly as the solution θ*(t) to F(θ) − x_d(t) = 0.

For the simple two-link robotic arm considered in this example, closed-form solutions for the inverse kinematics exist (see Craig [Cra89], p. 122). For demonstration purposes we will use dynamic inversion to invert the kinematics, and we will use the closed form of the inverse kinematics to check our results.
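For reference, the closed-form inverse kinematics of the two-link arm can be written in the usual law-of-cosines form, with an elbow sign selecting between the two solutions of Figure 5.5. The link lengths and the test point below are illustrative choices, not quoted from the text:

```python
import numpy as np

l1, l2 = 3.0, 2.0

def F(th):
    t1, t12 = th[0], th[0] + th[1]
    return np.array([l1*np.cos(t1) + l2*np.cos(t12),
                     l1*np.sin(t1) + l2*np.sin(t12)])

def ik(x, elbow=+1):
    # Closed-form inverse kinematics; elbow = +1 or -1 picks the solution.
    c2 = (x @ x - l1**2 - l2**2) / (2*l1*l2)
    t2 = elbow * np.arccos(np.clip(c2, -1.0, 1.0))
    t1 = np.arctan2(x[1], x[0]) - np.arctan2(l2*np.sin(t2), l1 + l2*np.cos(t2))
    return np.array([t1, t2])

x = np.array([3.75, 2.0])                     # a reachable interior point
for elbow in (+1, -1):
    th = ik(x, elbow)
    print(elbow, np.linalg.norm(F(th) - x))   # both solutions round-trip
```

Either solution can serve as the initial condition θ̄(0) of the dynamic inverter, which is how one branch or the other is selected below.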

Figure 5.5: Two configurations corresponding to the same end-effector position.

For each x in the interior of the workspace, there exist two configurations satisfying F(θ) = x, as indicated in Figure 5.5. As long as x_d is kept away from the boundary of the workspace, the two possible inverse-kinematic solutions of F(θ, t) = 0 never intersect⁷. We will first choose one inverse-kinematic solution, by our choice of initial conditions for dynamic inversion, and track it. Then we will change only our choice of initial conditions (θ̄(0), Φ(0)) for the dynamic inverter and track the other inverse-kinematic solution.

First we apply the derivative estimation algorithm to get estimators for θ̇* and θ̈*. For the two-link arm we have as an estimator for θ̇*(t),

    E¹(Φ, t) := Φ ẋ_d(t)    (5.39)

An estimator for θ̈*(t) is

    E²(θ̄, Φ, t) := E(θ̄, Φ, t) ẋ_d + Φ ẍ_d    (5.40)

⁷ Where the two isolated solutions meet, DF(θ) is singular.

where

E(θ̄, Φ, t) = −Φ ( Σ_{i=1}^{n} (∂DF(θ̄)/∂θ_i) E¹_i(Φ, t) ) Φ

           = −Φ ( [ −l1 c̄1 − l2 c̄12 , −l2 c̄12 ; −l1 s̄1 − l2 s̄12 , −l2 s̄12 ] E¹₁(Φ, t)
                 + [ −l2 c̄12 , −l2 c̄12 ; −l2 s̄12 , −l2 s̄12 ] E¹₂(Φ, t) ) Φ    (5.41)

Let c̄_i = cos(θ̄_i), s̄_i = sin(θ̄_i), c̄12 = cos(θ̄1 + θ̄2), and s̄12 = sin(θ̄1 + θ̄2). A dynamic inverter for this two-link manipulator control problem is

    dθ̄/dt = −μ Φ ( [ l1 c̄1 + l2 c̄12 ,  l1 s̄1 + l2 s̄12 ]ᵀ − x_d(t) ) + Φ ẋ_d(t)

    dΦ/dt = −μ Φ ( [ −l1 s̄1 − l2 s̄12 , −l2 s̄12 ; l1 c̄1 + l2 c̄12 , l2 c̄12 ] Φ − I ) + E(θ̄, Φ, t)    (5.42)

Let

    τ(θ, θ̇, θ̄, Φ, t) = V(θ, θ̇) + W(θ) + M(θ) [ E²(θ̄, Φ, t) − B₁(θ̇ − E¹(Φ, t)) − B₀(θ − θ̄) ]
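A scaled-down numerical rendering of an inverter of this form shows θ̄ tracking the moving inverse-kinematic solution. The link lengths, gain, and reference path below are illustrative (smaller and slower than the chapter's figure), and, as the chapter suggests, the inverter is initialized from a single closed-form solve:

```python
import numpy as np

l1, l2 = 1.0, 1.0   # illustrative, not Table 5.1's arm
mu = 20.0

def F(th):
    t1, t12 = th[0], th[0] + th[1]
    return np.array([l1*np.cos(t1) + l2*np.cos(t12),
                     l1*np.sin(t1) + l2*np.sin(t12)])

def DF(th):
    t1, t12 = th[0], th[0] + th[1]
    return np.array([[-l1*np.sin(t1) - l2*np.sin(t12), -l2*np.sin(t12)],
                     [ l1*np.cos(t1) + l2*np.cos(t12),  l2*np.cos(t12)]])

def dDF(th, i):                     # partial of DF w.r.t. theta_i
    t1, t12 = th[0], th[0] + th[1]
    d12 = np.array([[-l2*np.cos(t12), -l2*np.cos(t12)],
                    [-l2*np.sin(t12), -l2*np.sin(t12)]])
    if i == 1:
        return d12
    return d12 + np.array([[-l1*np.cos(t1), 0.0], [-l1*np.sin(t1), 0.0]])

def x_d(t):     return np.array([1.2 + 0.2*np.cos(t), 0.2*np.sin(t)])
def x_d_dot(t): return np.array([-0.2*np.sin(t), 0.2*np.cos(t)])

# initialize (th_bar, Phi) from the closed-form solution at t = 0
x0 = x_d(0.0)
c2 = (x0 @ x0 - l1**2 - l2**2) / (2*l1*l2)
t2 = np.arccos(np.clip(c2, -1.0, 1.0))
t1 = np.arctan2(x0[1], x0[0]) - np.arctan2(l2*np.sin(t2), l1 + l2*np.cos(t2))
th_bar = np.array([t1, t2])
Phi = np.linalg.inv(DF(th_bar))

dt, T = 0.001, 10.0
t = 0.0
while t < T:
    E1 = Phi @ x_d_dot(t)                      # estimator (5.39)
    E = -Phi @ (dDF(th_bar, 0)*E1[0] + dDF(th_bar, 1)*E1[1]) @ Phi  # (5.41)
    dth = -mu * Phi @ (F(th_bar) - x_d(t)) + E1
    dPhi = -mu * Phi @ (DF(th_bar) @ Phi - np.eye(2)) + E
    th_bar, Phi, t = th_bar + dt*dth, Phi + dt*dPhi, t + dt

print(np.linalg.norm(F(th_bar) - x_d(T)))   # th_bar tracks the IK solution
```

The estimate stays on the elbow branch selected by its initial condition, which is the mechanism used in Section 5.5.1 to track "the other" solution.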

Figure 5.6 shows the results of a simulation. The integration was performed in Matlab [Mat92] using the adaptive step size Runge-Kutta integrator ode45. The parameters and initial conditions used in the simulation are shown in Table 5.1 below.

Table 5.1: The table on the left shows parameters for the simulation of implicit tracking control of a two-link robot arm. The table on the right shows initial conditions. All angles are in radians.

    parameter   value            variable   initial value
    B1          I                θ̄(0)      (0, π/2)
    B0          I                Φ(0)       [0, 1/3; −1/2, −1/3]
    μ           10               θ(0)       (π/2, 0)
    l1          3 [m]            θ̇(0)      (0, 0)
    l2          2 [m]
    m1          1 [kg]
    m2          1 [kg]
    g           9.8 [m/s²]


[Figure 5.6: four panels — "Workspace Paths" (x1 [m] vs x2 [m]), "Joint-Space Paths" (θ1 [rad] vs θ2 [rad]), and time plots of ‖e_est‖ and ‖e_track‖.]


The top left graph of Figure 5.6 shows the resulting end-effector path F(θ) (solid), the desired path x_d = F(θ*) (dotted), and the image F(θ̄) of the estimator θ̄ through the forward-kinematics map F (dashed). Both F(θ̄) and the path of the end-effector x(t) can be seen to converge to the desired path. The top right graph of Figure 5.6 shows a similar picture, but in joint space. Again, the convergence of both the estimator θ̄ (dashed) for the inverse-kinematic solution, and the actual joint angles θ (solid), to the inverse-kinematic solution θ* (dotted) corresponding to the desired trajectory can be seen. The upper bottom graph of Figure 5.6 shows the norm of the estimation error ‖θ̄(t) − θ*(t)‖₂, and the lower bottom graph shows the norm of the tracking error ‖[θ(t), θ̇(t)]ᵀ − [θ*(t), θ̇*(t)]ᵀ‖₂, graphed versus time. The particular inverse-kinematic solution chosen was due to the choice of θ̄(0).

5.5.1

Tracking the Other Solution


We may cause the arm to track the other inverse-kinematic solution simply by choosing a different set of initial conditions for the dynamic inverter. Figure 5.7 shows the results using the same parameters and manipulator initial conditions θ(0) and θ̇(0) as above, but with dynamic inverter initial conditions θ̄(0) and Φ(0) as indicated in Table 5.2.

Table 5.2: Initial conditions for the simulation of implicit tracking control of the other solution for a two-link robot arm. All angles are in radians.

    variable   initial value
    θ̄(0)      (1.1760, −π/2)
    Φ(0)       [−0.3077, 0.1282; 0.5000, 0.3333]
    θ(0)       (π/2, 0)
    θ̇(0)      (0, 0)

Again, the end-effector path F(θ) (solid), the desired path xd(t) (dotted), and the image of θ̂ through F in the workspace, F(θ̂) (dashed), are shown in the top left graph of Figure 5.7. The top right graph of Figure 5.7 shows the corresponding joint-space paths θ (solid), θd (dotted), and θ̂ (dashed), and the bottom graphs show the estimation error ‖θ̂(t) − θd(t)‖2 (upper) and tracking error (θ(t) − θ̂(t), θ̇(t) − (d/dt)θ̂(t)) (lower). Once again

[Figure 5.7 appears here: four panels — Workspace Paths (x1 [m] vs. x2 [m]), Joint-Space Paths (θ1 [rad] vs. θ2 [rad]), ‖eest‖ vs. t, and ‖etrack‖ vs. t.]
Figure 5.7: The top left graph shows convergence of the workspace paths: F(θ) (solid), F(θ̂) (dashed), and F(θd) (dotted) for the other inverse-kinematic solution corresponding to the initial conditions of Table 5.2. Note that the path F(θd) is a periodic curve of period 2. The top right graph shows convergence of the joint-space paths: θ (solid), θ̂ (dashed), and θd (dotted) for the other inverse-kinematic solution. For the top graphs the symbol 'o' marks the initial condition for each trajectory. For the bottom two graphs, the upper plot shows the l2-norm of the estimation error eest = θ̂(t) − θd(t), and the lower plot shows the l2-norm of the tracking error etrack = [(θ(t) − θ̂(t))^T, (θ̇(t) − (d/dt)θ̂(t))^T]^T.


the tracking error can be seen to converge to zero.


5.6    Chapter Summary

The implicit tracking controller of Chapter 4 has been applied to the robot control problem of tracking workspace trajectories using joint-space control. This approach provides exponentially convergent tracking of the inverse-kinematic solution corresponding to a continuous end-effector trajectory in the workspace. This results in exponential tracking of the desired end-effector path in the workspace. The controller has been posed in continuous time, using a dynamic inverter to produce approximations of the joint-space signals necessary for control.

Though the two-link robot arm of Section 5.5 had simple rotary joints, it should be kept in mind that dynamic inversion may be used for inverse kinematics of manipulators with more complex joint geometry than simple prismatic or rotary joints. This includes, for instance, joints such as spherical joints⁸ as well as joints with less regular geometries (see Figure 5.8), like those found in the human body. A more general joint parameterization might require multiple parameters for a single joint. A more general forward-kinematic map F might reflect changes in link length as a function of joint configuration. As long as our assumptions on the rank and smoothness of F(θ) hold, and as long as a continuous isolated solution exists, dynamic inversion and the implicit tracking theorem will work for such manipulators.

Figure 5.8: An irregular joint geometry.

⁸ Spherical joints may be modeled as the coincidence of three rotary joints.


Chapter 6

Approximate Output Tracking for a Class of Nonminimum-Phase Systems

6.1    Introduction

In this chapter we study the tracking control problem for a class of time-invariant nonlinear nonminimum-phase control systems which we call balance systems. A balance system has associated with it a controllable linearization at its origin, as well as certain other structural properties which we will exploit. Balance systems are a useful class of models for physical systems in which gravitational balance must be maintained. Some examples of systems which are appropriately modeled as balance systems are bicycles, motorcycles, rockets, winged aircraft, and the inverted pendulum on a cart.

The problem with which we will concern ourselves in this chapter is the output tracking problem, where we wish to cause the output of a balance system to track a desired reference trajectory while also maintaining the internal state within given bounds. We will not assume that balance systems have output-bounded internal dynamics (see Chapter 4, Definition 4.2.7).

As an example of a balance system consider the cart-ball system of Figure 6.1. The output of the system is the position of the cart, y. The input to the system is the acceleration of the cart, u = ÿ, and the internal state of the system is the position and velocity of the ball relative to a frame fixed to the cart as shown. The ball is modeled as a particle, i.e. with zero moment of inertia, and is assumed to be constrained vertically to


[Figure 6.1 appears here.]

Figure 6.1: A balancing cart-ball system.

remain on the curved surface shown, but free to roll off of either lateral extreme of the cart's surface. The origin of the cart-ball system corresponds to the cart being still (ẏ = 0) at y = 0, with the ball at θ = 0 and motionless (θ̇ = 0).

The class of tracking problems which we will consider is represented by the objective of controlling the cart's position y to track a desired trajectory yd(t) without having the ball slide off of the cart. Critical to the definition of the problem class, however, is that we wish our controller to work for any reference trajectory yd(·) chosen from an open set of reference trajectories, where the open set contains the trajectory yd(·) ≡ 0. For our cart-ball system, for instance, we restrict the reference trajectory yd(·) to satisfy

    sup_{t≥0} ‖[yd(t), ẏd(t), ÿd(t)]^T‖ < ε    (6.1)

for some ε > 0.

6.1.1    Limitations on Tracking Performance

The tracking performance of systems such as the cart-ball system of Figure 6.1 has some inherent limitations. In contrast to systems with output-bounded internal dynamics, the internal state (the position and velocity of the ball relative to the cart, (θ, θ̇)) cannot be ignored in the act of tracking an output trajectory, given that we require that the ball remain on the cart. Recall (see Chapter 4) that for feedback-linearizable systems with output-bounded internal dynamics, we can achieve exponentially convergent tracking of any sufficiently smooth output reference function yd if the derivatives of yd are sufficiently small. In particular, if the initial output tracking error is zero, then the output tracking error is zero for all time. A sufficiently small bound on the derivatives of yd(t) ensures bounded internal dynamics. This is obviously not true for the cart-ball system above. For

136    Approximate Output Tracking    Chap. 6

instance, if we wish to track yd(·) ≡ 0, but the ball is not initially at the origin with zero velocity, then the ball will fall off of the cart.

Causal Exact Tracking

A more subtle limitation of the cart-ball system is that even if we are allowed to choose the initial condition (θ(0), θ̇(0)) of the ball, we still cannot achieve exact output tracking, with bounded internal state, over an open set of reference trajectories using a causal controller. By a causal controller we mean a controller which, at any time t, requires no more information about the reference trajectory yd(·) than a finite-length vector [yd(t), yd^(1)(t), ..., yd^(k)(t)]^T. If we know all of yd(t) for t ≥ 0 in advance, we can, in some cases, determine an initial state of our control system such that, under the assumption of exact output tracking, the state of the control system remains bounded. But in many cases of interest it is impractical to assume such knowledge.

That we cannot achieve exact causal tracking of any output reference trajectory from an open set can be demonstrated as follows. Choose an arbitrarily small ε > 0. For any such ε we can construct a C² reference trajectory yd(t) such that ẏd(t) ≥ 0 for all t ≥ 0 and sup_{t≥0} ‖[yd(t), ẏd(t), ÿd(t)]^T‖ < ε. An example is

    yd(t) = { 0,                                            t ∈ [0, t1]
            { k [ (t − t1)/2 − sin(2π(t − t1))/(4π) ],      t ∈ [t1, t1 + 1]
            { k/2,                                          t ≥ t1 + 1        (6.2)

for t1 ≥ 0. Differentiate yd(t) to get

    ẏd(t) = { 0,                          t ∈ [0, t1]
            { k sin²(π(t − t1)),          t ∈ [t1, t1 + 1]
            { 0,                          t ≥ t1 + 1                          (6.3)

and differentiate ẏd(t) to get

    ÿd(t) = { 0,                                          t ∈ [0, t1]
            { 2πk sin(π(t − t1)) cos(π(t − t1)),          t ∈ [t1, t1 + 1]
            { 0,                                          t ≥ t1 + 1          (6.4)

If k < ε/π, then (6.1) holds. The signals yd(t), ẏd(t), and ÿd(t) are graphed in Figure 6.2 for k = 1/(2π), where k is sufficiently small for ε = 1.
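These formulas are easy to sanity-check numerically. The sketch below (illustrative only) evaluates (6.2)–(6.4) for t1 = 1 and k = 1/(2π), and confirms that (6.3) is the derivative of (6.2), that the peak of |ÿd| is πk = 0.5, and that the bound in (6.1) holds for ε = 1:

```python
import numpy as np

t1 = 1.0
k = 1.0/(2*np.pi)

def yd(t):
    s = np.clip(t - t1, 0.0, 1.0)              # progress through the maneuver
    return k*(s/2 - np.sin(2*np.pi*s)/(4*np.pi))

def yd_dot(t):
    s = t - t1
    return k*np.sin(np.pi*s)**2 if 0.0 <= s <= 1.0 else 0.0

def yd_ddot(t):
    s = t - t1
    return 2*np.pi*k*np.sin(np.pi*s)*np.cos(np.pi*s) if 0.0 <= s <= 1.0 else 0.0

ts = np.linspace(0.0, 3.0, 3001)
vals = np.array([[yd(t), yd_dot(t), yd_ddot(t)] for t in ts])

# (6.3) should match a central difference of (6.2) inside the maneuver interval.
h = 1e-6
fd_err = max(abs((yd(t + h) - yd(t - h))/(2*h) - yd_dot(t)) for t in [1.2, 1.5, 1.9])

peak_acc = np.abs(vals[:, 2]).max()            # should equal pi*k = 0.5
sup_norm = np.linalg.norm(vals, axis=1).max()  # the quantity bounded in (6.1)
```

The supremum norm comes out near 0.5, comfortably below ε = 1, consistent with the condition k < ε/π.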


[Figure 6.2 appears here: three panels — yd, d/dt yd, and d²/dt² yd, each plotted versus t.]
Figure 6.2: A reference trajectory yd(t) such that ẏd(t) ≥ 0 for all t ≥ 0, and such that sup_{t≥0} ‖[yd(t), ẏd(t), ÿd(t)]^T‖ < ε. For this graph t1 = 1, k = 1/(2π).

In general we assume t1 > 0 is unknown. Now assume that the cart tracks yd(t) (6.2) exactly, but that we do not know t1 in advance. Such tracking is easily accomplished by setting the input u to be

    u = ÿd(t) − α1 (ẏ − ẏd(t)) − α0 (y − yd(t))    (6.5)

and setting y(0) = yd(0) and ẏ(0) = ẏd(0), with α0 > 0 and α1 > 0. The cart travels at constant velocity for all t ≥ t1 + 1. Thus if θ(t1 + 1) = 0 and θ̇(t1 + 1) = 0, then for all t ≥ t1 + 1 the ball remains at rest at its zero configuration (θ, θ̇) = (0, 0) relative to the cart.

There may or may not exist a solution such that the ball starts on the cart at time t = 0 with some initial velocity, and ends up at (θ, θ̇) = 0 at time t = t1 + 1 without falling off the cart. If a solution does not exist, then obviously there is no initial condition at which we can set the ball so that (θ(t1 + 1), θ̇(t1 + 1)) = (0, 0). If a solution does exist, then for each choice of t1 that solution is unique. Thus the propitious initial conditions for the ball are unique for each value of t1. But without knowledge of t1 we cannot determine this


unique propitious initial condition, and every other initial condition will cause the ball to eventually fall off of the cart, because the ball does not end up at its zero state (θ, θ̇) = (0, 0) at time t1 + 1 (after which the cart travels at constant velocity). Therefore, we cannot, in a causal manner, achieve exact tracking while keeping the ball on the cart. Since we can make ε as small as we wish and still define yd of the form (6.2), it follows that there is no ε > 0 for which we can achieve exact tracking on all yd(t) satisfying (6.1).
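The easy half of this argument — that the cart alone can track exactly from matched initial conditions under a control of the form (6.5) — can be checked in a few lines. The sketch below uses a generic smooth reference, yd(t) = sin t, and hypothetical gains α0, α1, rather than the specific trajectory (6.2):

```python
import numpy as np

a0, a1 = 1.0, 2.0                       # hypothetical gains, both positive

yd      = np.sin                        # generic C^2 reference
yd_dot  = np.cos
yd_ddot = lambda t: -np.sin(t)

# Double-integrator cart: y'' = u, with u of the form (6.5).
y, v = 0.0, 1.0                         # matched ICs: y(0) = yd(0), y'(0) = yd'(0)
dt, T = 1e-4, 5.0
max_err = 0.0
for step in range(int(T/dt)):
    t = step*dt
    u = yd_ddot(t) - a1*(v - yd_dot(t)) - a0*(y - yd(t))
    y, v = y + dt*v, v + dt*u           # forward Euler step
    max_err = max(max_err, abs(y - yd(t + dt)))
```

Because the error dynamics ë + α1 ė + α0 e = 0 start from e(0) = ė(0) = 0, the output error stays at zero up to integration error; the hard part of the argument is the ball, not the cart.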

6.1.2    The Inversion Problem for Nonlinear Systems

Given a control system, the inversion problem is the problem of determining a feasible state trajectory such that the output resulting from that state trajectory is a preassigned output reference trajectory yd(t). A feasible state trajectory whose corresponding output is the given output reference is referred to as the inverse corresponding to yd.

We seek to construct a causal controller for balance systems with the property that the closed-loop system, consisting of balance system and controller, has output-bounded internal dynamics while providing output tracking. In the present chapter, however, we wish to have the ability to make the bound on the internal state of the closed-loop system arbitrarily small so that we can satisfy any bound on the internal state. This will require a bound on the reference output yd of the closed-loop system, as well as on its derivatives.

6.1.3    How Dynamic Inversion Will Be Used

We will use dynamic inversion (see Chapters 2 and 3) to produce an estimate of an implicitly defined function of state called the internal equilibrium angle, θe. We will use a variation of the derivative estimator algorithm, Algorithm 4.3.1, in order to obtain estimates of the derivatives of θe with respect to time. Both θe and its derivative estimates will then be incorporated into a tracking control law as the internal part of an approximate inverse trajectory. By causing the internal state (e.g. (θ, θ̇)) to approximately track (θe, θ̇e), the proposed control law will allow approximate tracking with bounded internal dynamics.

6.1.4    Previous Work

The problem of output tracking for linear time-invariant systems was solved by Francis [Fra77]. Isidori and Byrnes [IB90] generalized the result of Francis to the time-invariant nonlinear case. Both results give asymptotic tracking of any member in a family of signals generated by time-invariant autonomous dynamic systems. Though the linear problem may be solved by solving a set of linear matrix equations, the nonlinear problem requires the solution of a non-trivial set of partial differential equations. Huang and Rugh [HR92a, HR92b] and Krener [Kre92] have studied conditions under which the Byrnes–Isidori equations are solvable. Tornambè [Tor91] presented a controller for single-input single-output feedback-linearizable systems using singular perturbation to stabilize a state trajectory compatible with exact tracking. Gurumoorthy and Sanders [GS93] used a singular perturbation approach to stabilize bounded and known state trajectories which yield the desired output exactly. Tracking of exosystem-generated signals using sliding mode control was studied by Gopalswamy and Hedrick [GH93]. In both [GS93] and [GH93] a reversal of a local approximation of the internal dynamics vector field was used in order to approximate the internal part of the inverse trajectory. Such a strategy is akin to the use of dynamic inversion in this chapter, though considerably more local.

Hauser, Sastry, and Meyer [HSM92] studied controllers for nonminimum-phase systems for which the transfer function of the linearization has a zero in the right-half plane, but close to the imaginary axis. Approximate feedback linearization was then used for tracking and regulation, by approximating the slightly nonminimum-phase system by one that is minimum phase and feedback linearizable.

Devasia, Paden, and Chen [DPC94] have introduced an iterative method of determining a bounded inverse for a class of nonlinear systems when such a bounded inverse exists. Once the inverse is found, one can resort to conventional tracking controllers to stabilize the inverse trajectory [DP94]. In the case of nonminimum-phase systems, the inverse constructed in [DPC94] is non-causal in that one must either start the system from the origin at t = −∞, where the inverse is assumed to be 0, or one must preset the initial conditions of the system. Once the solution is determined, one may set the initial conditions of the control system to match the predetermined solution, then use conventional tracking control techniques to stabilize the bounded trajectory. Hunt, Meyer, and Su [HMS94] have also presented constructive methods for finding a bounded inverse compatible with a desired output. Di Benedetto and Lucibello [BL93] have studied the case where a known solution exists and one is free to choose initial conditions. Again, this is a non-causal solution, since we must know the complete history of the output reference trajectory in order to make the correct choice of initial condition.


6.1.5    Differences in Our Approach

The approach presented in this chapter differs from previous approaches to tracking for nonlinear nonminimum-phase systems in a number of ways.

i. Unlike [IB90], [GH93], [Fra77], [HR92a], and [GS93], we do not rely upon an autonomous exosystem to produce the output trajectory we wish to track. We are motivated to avoid the autonomous-exosystem assumption by the realization that in the problem of controlling many nonlinear systems, vehicles for instance, reference trajectories do not originate in autonomous dynamic systems.

ii. We do not assume that a bounded internal trajectory exists under exact output tracking conditions.

iii. Unlike Tornambè [Tor91] we do not assume that the control system is feedback linearizable. We do, however, make other weaker assumptions regarding the structure of a partially feedback-linearized system.

iv. Our approach is to construct a submanifold of the state-space, called the internal equilibrium manifold, whose geometry depends upon a choice of output error dynamics. By making an open neighborhood of that manifold attractive and invariant, approximate¹ output tracking with bounded internal dynamics is achieved. In fact, Grizzle, Di Benedetto, and Lamnabhi-Lagarrigue [GBLL94] have shown that exact causal tracking for nonminimum-phase systems is impossible². In the present case, there is no inverse state-trajectory, corresponding to exact tracking, in the internal equilibrium manifold, but there is an approximate inverse in a neighborhood of the manifold; our controller renders that neighborhood invariant. Tracking the inverse solution corresponds to approximate output tracking.

v. Unlike Francis [Fra77], Devasia et al. [DPC94], and Hunt et al. [HMS94], we do not construct a particular control to drive the output along a desired reference trajectory and superimpose a feedback control to stabilize the particular solution. Rather, we construct a control that, in a sense, stabilizes a desired output error dynamics.

¹ By approximate tracking is meant tracking with bounded error, where the bound on the error depends on a norm of the reference trajectory, defined below (6.57).
² Strictly speaking, what Grizzle et al. [GBLL94] showed was that given a nonminimum-phase analytic control system, there does not exist an analytic compensator which provides exact tracking of an open set of trajectories while maintaining internal stability.


vi. Unlike [DPC94], the control scheme presented in this chapter is causal in the case of nonminimum-phase systems, and solves the problems of stabilization and trajectory generation simultaneously. We do not rely upon knowledge of a bounded inverse. Instead, exactness of tracking is sacrificed in favor of assuring boundedness of a solution without the need to assume a priori knowledge of the solution. Our approach also helps to elucidate some of the limitations on the output tracking task for nonminimum-phase systems, and the role of geometrical features in those limitations. Though the controller presented provides only approximate tracking, it is approximate tracking of an open set of trajectories. There are many applications in which boundedness of a solution is more important than exactness of tracking. In particular, the present work has been inspired by the problem of controlling a bicycle [Get94, Get95, GM95a], examined in Chapter 7, for which balance must be maintained at all times, and where perfect path-tracking accuracy may be seen as being somewhat less important.

6.1.6    Main Results

The main results of this chapter are:

We exhibit a new class of systems, called external/internal convertible systems³, such that their external dynamics are converted to internal dynamics, and their internal dynamics are made external, by a choice of input coordinate change and a new choice of output.

We present a subclass of nonminimum-phase external/internal convertible systems, called balance systems, which have unstable zero dynamics and controllable linearization at the origin.

We introduce a manifold associated with the internal dynamics of partially feedback-linearizable systems called an internal equilibrium manifold. Controlling the state of the control system to a neighborhood of the internal equilibrium manifold corresponds to balancing the ball, in the cart-ball system above, while the cart tracks a desired reference trajectory.

We describe a causal controller for balance systems which solves for an approximate bounded inverse while simultaneously making a region around that bounded inverse

³ Systems of this type were first introduced, albeit sloppily, in [GH95].


attractive and invariant. We also prove that under appropriate conditions the controller results in approximate tracking of a given reference trajectory, and internal dynamics that are bounded above by a class-K function (see Appendix A for definitions and notation) of a bound on the reference output and its derivatives.

We state and prove a theorem on the stability of exponentially stable systems under affine perturbations.

6.1.7    Chapter Preview

In order to set the stage for a comparison of a linear controller and the somewhat more complicated controller we will present in this chapter, we first review, in Section 6.2, the Jacobian linearization⁴ of time-invariant nonlinear control systems. In Section 6.3 we describe more precisely the class of nonlinear systems under consideration, and the problem which we will solve. We will point out some structural features of the class, features which our controller exploits. In Section 6.4 we discuss output tracking control of our system class, ignoring for the moment unstable internal dynamics. In Section 6.5 we discuss the control of the internal dynamics, ignoring, again for the moment, the behavior of the output of the system. In Section 6.6 we define the internal equilibrium manifold, an intrinsic geometric structure associated with the class of control systems which we consider. The internal equilibrium manifold is associated with states for which the internal dynamics are in some sense balanced, as one balances a broomstick while walking across a room. In Section 6.7 we propose a tracking controller based upon the internal equilibrium manifold. In Section 6.8 we show how dynamic inversion may be applied to the estimation of the internal equilibrium variables. In Section 6.9 we apply the tracking controller to the classical problem of controlling an inverted pendulum on a cart, where the cart position is the output we wish to cause to track a desired trajectory. Finally, in Section 6.10 we simulate the controller applied to the inverted pendulum on a cart and demonstrate a significant performance improvement over results obtained using a linear-quadratic-regulator approach [CD91a].

⁴ We use the term Jacobian linearization rather than linearization in order to distinguish the linearization of a control system at a point in state-space from feedback linearization, where a state-dependent coordinate change renders a nonlinear control system linear from input to output.


6.2    Jacobian Linearization and Regions of Attraction

6.2.1    Motivation

In this chapter we will appeal to geometry as well as Jacobian linearization in order to derive a controller for a class of nonminimum-phase systems. A step in our design process will involve choosing control parameters such that the Jacobian linearization of the controlled nonlinear system is exponentially stable at the origin. Thus, by design, stability in an arbitrarily small neighborhood of the origin will not be an issue. Our controller will be more complex than a linear controller, even though we will assume that the systems we deal with have controllable linearizations at the origin. Therefore, in order to justify the increase in complexity, we will show, through simulation and comparison, that the domain of attraction of the origin is notably larger than the domain of attraction for a standard linear (LQR) controller. In light of this comparison, a brief review of the role of Jacobian linearization in nonlinear control is appropriate.

6.2.2    The Role of Jacobian Linearization in Nonlinear Control

We review Jacobian linearization as applied to a time-invariant nonlinear system of the form

    ẋ = f(x, u)    (6.6)

with x ∈ R^n, u ∈ R^m, and with equilibrium x = 0, i.e. f(0, 0) = 0. We will assume that f(x, u) is C² in its arguments. Assume a controller of the form u = −Kx, where K ∈ R^{m×n} is as yet undetermined. Expand

    ẋ = f(x, −Kx)    (6.7)

in a Taylor expansion about (x, u) = (0, 0), giving

    ẋ = (∂f(0,0)/∂x) x − (∂f(0,0)/∂u) Kx + g(x)    (6.8)

where g(x) ∈ R^n satisfies ‖g(x)‖ = O(‖x‖²). Let

    A := ∂f(0,0)/∂x ∈ R^{n×n}  and  B := ∂f(0,0)/∂u ∈ R^{n×m}    (6.9)

and ignore the O(‖x‖²) term g(x) to get the linearized model of system (6.6) about the origin x = 0,

    ẋ = Ax − BKx    (6.10)


Assume that the pair (A, B) is controllable. Through standard methods (see [CD91a]) we determine a K such that A − BK has eigenvalues in C⁻ₒ (see Appendix A for notation); hence for x(0) sufficiently close to the origin, the solution x(t) of (6.7) with such a choice of K is guaranteed to obey x(t) → 0 as t → ∞. If, in fact, g(x) ≡ 0 in (6.8), then the nonlinear system (6.6) is linear, and the origin is globally exponentially stable for any K such that σ(A − BK) ⊂ C⁻ₒ. More typically, however, the term g(x) becomes large as ‖x‖ increases, and the largest ball centered at the origin and contained in the domain of attraction of the origin is bounded.

6.2.3    Different Controllers, Same Linearization

The form of the state feedback u = −Kx is simple and convenient, easily realized by computer. Linear state feedback is, however, only one of an infinite number of functions we may choose for our controller having the same linearization. For instance, let

    u*(x) = −Kx + u₂(x)    (6.11)

where ‖u₂(x)‖ = O(‖x‖²). Then a Taylor expansion of ẋ = f(x, u*(x)) about (x, u) = (0, 0) is

    ẋ = (∂f(0,0)/∂x) x + (∂f(0,0)/∂u)(∂u*(0)/∂x) x + h(x)
      = (∂f(0,0)/∂x) x − (∂f(0,0)/∂u) Kx + h(x)    (6.12)

where h(x) ∈ R^n is O(‖x‖²).

Equation (6.12) has the same form as equation (6.8) and has the identical linearization (6.10), where A and B are defined as in (6.9). Thus, as long as u* (6.11) has the linearization −Kx, the systems ẋ = f(x, −Kx) and ẋ = f(x, −Kx + u₂(x)) will behave similarly for initial conditions close to the origin.

6.2.4    Regions of Attraction

The nonlinear term u₂(x) in (6.11) may cause the size and shape of the region of attraction⁵ of the origin of (6.12) to differ widely from the region of attraction of the origin of system (6.7), due to differences between g(x) and h(x) (see (6.8) and (6.12)).

For systems having vector relative degree (see Section C in the Appendix, or Isidori [Isi89], p. 235) in a neighborhood of the origin, and which are fully linearizable

⁵ The region of attraction of an asymptotically stable equilibrium is the set of all initial conditions whose corresponding solutions go to the equilibrium as t → ∞.


(no internal dynamics), for instance, it is possible (see Appendix, Section C) to choose a state-dependent change of coordinates and a control u*(x) so that h(x) = 0 locally, making the origin of (6.12) globally exponentially stable (as long as the coordinate change is valid globally). For most nonlinear systems, however, determination of the region of attraction is, in general, highly problematic. Only in rare instances can one be specific about the size of the region. Occasionally a Lyapunov function can be found which will give a conservative bound on the region of attraction for very simple systems. The problem of determining the region of attraction of equilibria in nonlinear systems has inspired the creation of a number of simulation-based tools which attempt to ease the burden [Kad94, PC89]. These tools work by integrating the dynamical system for a large set of initial conditions. This method is obviously approximate, since regions of attraction can be notoriously complex, as in the case of chaotic systems [GH90]. The engineer must bring his or her experience to bear on the interpretation of such data. Computer tools are most effective for systems of two or three dimensions, since the resulting region of attraction is easily visualized. In most cases, however, one must rely upon simulation and physical understanding, as we will do here, in order to determine whether one controller provides a better region of attraction than another.
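A toy version of such a simulation-based sweep, for the scalar system ẋ = −x + x³ (whose region of attraction for the origin is exactly |x| < 1), might look like this:

```python
import numpy as np

def converges(x0, dt=1e-3, T=20.0):
    """Forward-Euler test of whether x' = -x + x^3 from x0 settles to the origin."""
    x = x0
    for _ in range(int(T/dt)):
        x += dt*(-x + x**3)
        if abs(x) > 10.0:      # far outside any bounded region: count as diverged
            return False
    return abs(x) < 1e-3

# Sweep a handful of initial conditions straddling the true boundary x = 1.
results = {x0: converges(x0) for x0 in [0.5, 0.9, 1.1, 1.5]}
```

The sweep correctly classifies initial conditions inside (0.5, 0.9) and outside (1.1, 1.5) the basin; real tools do the same thing on a dense grid of initial conditions in higher dimensions, which is why their output requires interpretation.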

6.3    Problem Description

In this section we will first define a class of systems called external/internal convertible systems, which we sometimes call E/I convertible for short, and point out a number of properties of such systems. In Section 6.3.1 we define the E/I convertible form. In Section 6.3.2 we describe a number of properties of E/I convertible systems. In Section 6.3.3 we discuss the Jacobian linearization of E/I convertible systems. In Section 6.3.4 we discuss the zero dynamics of E/I convertible systems. In Section 6.3.5 we show how partially linearizable nonlinear control systems may be put into E/I convertible form. Then, in Section 6.3.6, balance systems will be defined as a subclass of E/I convertible systems. This will allow us, in Section 6.3.7, to give a precise statement of the tracking problem we wish to solve.


6.3.1    External/Internal Convertible Form

We will consider single-input, single-output, n-dimensional, time-invariant nonlinear control systems of the form (see Figure 6.3)

External/Internal Convertible System

    Σ(u):   ẋᵢ = xᵢ₊₁,  i ≤ m − 1
            ẋₘ = u
            θ̇ᵢ = θᵢ₊₁,  i ≤ p − 1
            θ̇ₚ = f(x, θ) + g(x, θ) u
            y = x₁    (6.13)

with input u ∈ R, output y ∈ R, and state (x, θ), with x := (x₁, ..., xₘ) ∈ R^m and θ := (θ₁, ..., θₚ) ∈ R^p, with n = m + p. The coordinates (x, θ) are assumed to be defined on the open ball B_r ⊂ R^n about the origin. The origin is assumed to be an equilibrium of the system, thus f(0, 0) = 0.

Assumption 6.3.1 The functions f(x, θ) and g(x, θ) are C^n in their arguments for all (x, θ) in B_r ⊂ R^{m+p}.

Assumption 6.3.2 The function g(x, θ) is assumed to be nonzero for all (x, θ) ∈ B_r.

Definition 6.3.3 Systems of the form (6.13) satisfying Assumption 6.3.1 are called external/internal convertible systems.

For convenience we will often refer to external/internal convertible systems as E/I convertible. Figure 6.3 gives a block diagram showing the structure of an E/I convertible system.

[Figure 6.3 appears here: block diagram with an External Subsystem (a chain of integrators from u through xₘ, ..., x₂, x₁ = y) driving an Internal Subsystem through f(x, θ) + g(x, θ)u.]

Figure 6.3: An external/internal convertible system.

6.3.2    Properties of E/I Convertible Systems

External/internal convertible systems have some useful structural features and properties that we will now describe.

External and Internal Subsystems

We will refer to x ∈ R^m as the external state of Σ(u) (6.13), and to

    Σext(u):   ẋᵢ = xᵢ₊₁,  i ≤ m − 1
               ẋₘ = u    (6.14)

as the external subsystem of Σ(u) (6.13). Note that the external state x(t) of Σ(u) is completely determined by y(t), ..., y^(m−1)(t), because xᵢ = y^(i−1), i ≤ m (see Figure 6.4). Hence the external state is observable, and if y(t) ≡ 0 for all t, then x(t) ≡ 0 for all t.

[Figure 6.4 appears here: the chain of integrators from u = y^(m) through xₘ = y^(m−1) down to x₁ = y.]

Figure 6.4: The external subsystem Σext(u) of Σ(u) (see also Figure 6.3).

We will refer to θ ∈ R^p as the internal state of Σ(u) (6.13), and to

    Σint(x, u):   θ̇ᵢ = θᵢ₊₁,  i ≤ p − 1
                  θ̇ₚ = f(x, θ) + g(x, θ) u    (6.15)

as the internal subsystem of Σ(u) (see Figure 6.5).

Figure 6.5: The internal subsystem Σint(x, u) of Σ(u) (see also Figure 6.3).

The E/I convertible form (6.13) may be constructed from Σint and Σext as shown in Figure 6.6.

[Figure 6.6 appears here.]

Figure 6.6: The plant Σ(u) reconstructed from the internal and external subsystems.

Example 6.3.4 Consider the control system

    Σ(u):   ẋ₁ = x₂
            ẋ₂ = u
            θ̇₁ = θ₂
            θ̇₂ = x₁ + sin(θ₁) − cos(θ₁) cos(x₁) u    (6.16)

The external state is [x₁, x₂]^T and the external subsystem is

    Σext(u):   ẋ₁ = x₂
               ẋ₂ = u    (6.17)

The internal state is [θ₁, θ₂]^T and the internal subsystem is

    Σint(x, u):   θ̇₁ = θ₂
                  θ̇₂ = x₁ + sin(θ₁) − cos(θ₁) cos(x₁) u    (6.18)

The Dual Structure.  An E/I convertible system is convertible because, with a simple state-dependent input coordinate change and a change of output, the internal system is converted to an external system, and the external system is converted to an internal system. The resulting system will be referred to as the dual of $\Sigma(u)$. Let
$$u = g(x,\theta)^{-1}\left( v - f(x,\theta) \right) \qquad (6.19)$$
define a state-dependent input transformation, $u \mapsto v$. Define $y^c = \theta_1$ as the dual output. Apply transformation (6.19) to the E/I convertible system (6.13) to get $\Sigma^d(v)$:
$$\dot{x}_i = x_{i+1}, \ i \le m-1, \qquad \dot{x}_m = -g(x,\theta)^{-1}f(x,\theta) + g(x,\theta)^{-1}v$$
$$\dot{\theta}_i = \theta_{i+1}, \ i \le p-1, \qquad \dot{\theta}_p = v, \qquad y^c = \theta_1 \qquad (6.20)$$
Thus the input transformation (6.19) combined with the output assignment $y^c = \theta_1$ converts the internal dynamics of $\Sigma(u)$ to the external dynamics of $\Sigma^d(v)$, and the external dynamics of $\Sigma(u)$ to the internal dynamics of $\Sigma^d(v)$. This property is the origin of the term external/internal convertible.
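The conversion (6.19)-(6.20) can be checked numerically. The sketch below uses $f$ and $g$ from Example 6.3.4 (with the signs as reconstructed here; the duality check itself is independent of those sign choices): applying $u = g^{-1}(v - f)$ makes the internal state obey $\dot{\theta}_2 = v$ while the external state obeys $\dot{x}_2 = -g^{-1}f + g^{-1}v$.

```python
import numpy as np

# f and g from Example 6.3.4 (signs as reconstructed in this chapter).
f = lambda x, th: x[0] + np.sin(th[0])
g = lambda x, th: np.cos(th[0]) * np.cos(x[0])

rng = np.random.default_rng(0)
for _ in range(100):
    x = 0.3 * rng.standard_normal(2)     # small states keep g(x, th) != 0
    th = 0.3 * rng.standard_normal(2)
    v = rng.standard_normal()
    # input transformation (6.19): u = g^{-1}(v - f)
    u = (v - f(x, th)) / g(x, th)
    # internal dynamics of Sigma(u): th2_dot = f + g*u, which (6.20) says equals v
    assert np.isclose(f(x, th) + g(x, th) * u, v)
    # external dynamics: x2_dot = u = -g^{-1} f + g^{-1} v, as in (6.20)
    assert np.isclose(u, -f(x, th) / g(x, th) + v / g(x, th))
```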


6.3.3  The Linearization at the Origin
The Jacobian linearization of $\Sigma(u)$ (6.13) at the origin $(x,\theta) = (0,0)$ has the form
$$\begin{bmatrix} \dot{x} \\ \dot{\theta} \end{bmatrix} = \underbrace{\begin{bmatrix} A_{11} & 0 \\ A_{21} & A_{22} \end{bmatrix}}_{A} \begin{bmatrix} x \\ \theta \end{bmatrix} + \underbrace{\begin{bmatrix} B_1 \\ B_2 \end{bmatrix}}_{B} u \qquad (6.21)$$
with $B_1 \in \mathbb{R}^{m\times 1}$, $B_2 \in \mathbb{R}^{p\times 1}$, $A_{11} \in \mathbb{R}^{m\times m}$, $A_{21} \in \mathbb{R}^{p\times m}$, and $A_{22} \in \mathbb{R}^{p\times p}$ defined by
$$A_{11} = \begin{bmatrix} 0 & 1 & & 0 \\ \vdots & & \ddots & \\ 0 & 0 & & 1 \\ 0 & 0 & \cdots & 0 \end{bmatrix}, \qquad A_{21} = \begin{bmatrix} 0 & \cdots & 0 \\ \vdots & & \vdots \\ 0 & \cdots & 0 \\ & \partial f(0,0)/\partial x & \end{bmatrix} \qquad (6.22)$$
$$A_{22} = \begin{bmatrix} 0 & 1 & & 0 \\ \vdots & & \ddots & \\ 0 & 0 & & 1 \\ & \partial f(0,0)/\partial \theta & \end{bmatrix}, \qquad B_1 = \begin{bmatrix} 0 \\ \vdots \\ 0 \\ 1 \end{bmatrix} \qquad (6.23)$$
and
$$B_2 = \begin{bmatrix} 0 \\ \vdots \\ 0 \\ g(0,0) \end{bmatrix} \qquad (6.24)$$
We may make a few observations regarding the linearization (6.21) of (6.13) at the origin.

Observation 6.3.5  The form of the Jacobian linearization of an E/I convertible system neither precludes nor implies controllability at the origin. For example,
$$\dot{x} = u, \qquad \dot{\theta} = x + \theta - u \qquad (6.25)$$
is not controllable, while
$$\dot{x} = u, \qquad \dot{\theta} = x + u \qquad (6.26)$$
is controllable.


Observation 6.3.6  If $\partial f(0,0)/\partial x = 0$ and $\partial f(0,0)/\partial \theta = 0$, then the linearization is not controllable. For example,
$$\dot{x} = u, \qquad \dot{\theta} = cu \qquad (6.27)$$
where $c \neq 0$ is a constant, is not controllable at the origin.

Observation 6.3.7  If $f$ and $g$ in $\Sigma(u)$ (6.13) satisfy $\partial f/\partial x \equiv 0$ and $g(0,0) = 0$, then the pair $(A,B)$ (6.21) is not controllable. For example,
$$\dot{x} = u, \qquad \dot{\theta} = \theta \qquad (6.28)$$
is not controllable.

Since, when we define balance systems in Section 6.3.6, we will assume controllability of the linearization (6.21) of $\Sigma(u)$ (6.13), we know that we may stabilize the origin by linear state feedback. Well-established tools exist for such stabilization, e.g. the linear quadratic regulator (LQR) (see [CD91a]).

Remark 6.3.8  As mentioned in Section 6.2.1, the controller we will describe in this chapter is more complex than a linear controller. We will show that the trade-off for such complexity is a substantial increase in the region of attraction of the origin in the case of regulation, along with a substantial increase in performance in the case of tracking. Our scheme will depend upon $n$ parameters, with the requirement that the linearization of the resulting closed-loop system is stable at the origin. In fact, later, for comparison purposes, we will make the linearization of the controlled subsystem identical to that of a linear quadratic regulator.
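The controllability claims in Observations 6.3.5-6.3.7 can be confirmed by computing ranks of controllability matrices. The sketch below uses the sign pattern reconstructed above for (6.25) (the extraction dropped the original signs; any pattern with $\partial f/\partial x + (\partial f/\partial\theta)\,g(0,0) = 0$ yields the same uncontrollability), and an illustrative constant $c = 2$ for (6.27).

```python
import numpy as np

def ctrb_rank(A, B):
    """Rank of the controllability matrix [B, AB, ..., A^{n-1}B]."""
    n = A.shape[0]
    C = np.hstack([np.linalg.matrix_power(A, k) @ B for k in range(n)])
    return np.linalg.matrix_rank(C)

# (6.25) as reconstructed: xdot = u, thdot = x + th - u   -> not controllable
A1 = np.array([[0.0, 0.0], [1.0, 1.0]]); B1 = np.array([[1.0], [-1.0]])
# (6.26): xdot = u, thdot = x + u                         -> controllable
A2 = np.array([[0.0, 0.0], [1.0, 0.0]]); B2 = np.array([[1.0], [1.0]])
# Observation 6.3.6 example: xdot = u, thdot = c*u, c = 2 -> not controllable
A3 = np.zeros((2, 2));                   B3 = np.array([[1.0], [2.0]])
# Observation 6.3.7 example: xdot = u, thdot = th         -> not controllable
A4 = np.array([[0.0, 0.0], [0.0, 1.0]]); B4 = np.array([[1.0], [0.0]])

assert ctrb_rank(A1, B1) == 1
assert ctrb_rank(A2, B2) == 2
assert ctrb_rank(A3, B3) == 1
assert ctrb_rank(A4, B4) == 1
```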

6.3.4  The Zero Dynamics

For many single-input, single-output systems the relative degree of the output $y$ is less than the system dimension $n$. For such systems we may define zero dynamics, as discussed in Chapter 4. The stability or instability of those zero dynamics is a critical factor in determining the sort of control that may be used for tracking. The zero dynamics of the E/I convertible system $\Sigma(u)$ (6.13) are


Zero Dynamics of the E/I Convertible Form
$$\dot{\theta}_i = \theta_{i+1}, \ i \le p-1, \qquad \dot{\theta}_p = f(0,\theta) \qquad (6.29)$$
obtained by restricting the input $u$ and the output $y$ of $\Sigma(u)$ (6.13) to be identically zero (see Figure 6.7), or equivalently obtained as $\Sigma_{\rm int}(0,0)$ (see Equation (6.15)).

Figure 6.7: The zero dynamics of $\Sigma(u)$.

Balance systems will be defined below as having zero dynamics (6.29) that are unstable at $\theta = 0$. The zero dynamics of the dual system $\Sigma^d(v)$ (6.20) are obtained by restricting $\theta \equiv 0$ and $v \equiv 0$. The resulting system is the zero dynamics of $\Sigma^d(v)$,

Zero Dynamics of the Dual
$$\dot{x}_i = x_{i+1}, \ i \le m-1, \qquad \dot{x}_m = -g(x,0)^{-1}f(x,0) \qquad (6.30)$$
The stability or instability of the zero dynamics of $\Sigma^d(v)$ is independent of the stability or instability of the zero dynamics of $\Sigma(u)$. This is demonstrated by the following example.

Example 6.3.9  Special Cases: Internal Stability of Dual Systems.  The linear system
$$\dot{x} = u, \qquad \dot{\theta} = x + \theta + u, \qquad y = x \qquad (6.31)$$
has zero dynamics
$$\dot{\theta} = \theta \qquad (6.32)$$
obtained by setting $u \equiv 0$ and $y \equiv 0$. This zero dynamics (6.32) is unstable; if $\theta(0) \neq 0$ then $|\theta(t)| \to \infty$. The system dual to (6.31),
$$\dot{x} = -x - \theta + v, \qquad \dot{\theta} = v, \qquad y^c = \theta \qquad (6.33)$$
however, has the stable zero dynamics
$$\dot{x} = -x \qquad (6.34)$$
The system
$$\dot{x} = u, \qquad \dot{\theta} = -x + \theta + u, \qquad y = x \qquad (6.35)$$
has unstable zero dynamics and a dual system
$$\dot{x} = x - \theta + v, \qquad \dot{\theta} = v, \qquad y^c = \theta \qquad (6.36)$$
which also has unstable zero dynamics.

6.3.5  Conversion of Control Systems to External/Internal Convertible Form
Single-input, single-output, time-invariant nonlinear control systems of the form
$$\dot{\bar{x}} = \bar{f}(\bar{x}) + \bar{g}(\bar{x})u, \qquad y = h(\bar{x}) \qquad (6.37)$$
may be brought into the external/internal convertible form (6.13) in a number of ways. For instance, suppose the output $y$ has relative degree $m$ (see Appendix, Section C, or Isidori [Isi89], Chapter 4). Suppose also that there exists another function $\varphi(\bar{x})$ having relative degree $p$ and such that
$$\Phi(\bar{x}) := \left[ L_{\bar{f}}^0 h, \ldots, L_{\bar{f}}^{m-1} h, L_{\bar{f}}^0 \varphi, \ldots, L_{\bar{f}}^{p-1} \varphi \right](\bar{x}) \qquad (6.38)$$
is a diffeomorphism in a neighborhood of each $\bar{x} \in B_r \subset \mathbb{R}^n$. Then the coordinate transformation $\bar{x} \mapsto \Phi(\bar{x})$ along with the input transformation
$$u = \frac{-L_{\bar{f}}^m h(\bar{x}) + \bar{u}}{L_{\bar{g}} L_{\bar{f}}^{m-1} h(\bar{x})} \qquad (6.39)$$
brings (6.37) to the convertible form of $\Sigma(u)$ given in (6.13).

Many underactuated mechanical systems(6) may be easily brought to convertible form (6.13), as shown by the following example.

Example 6.3.10  Conversion of a Mechanical System to External/Internal Convertible Form.  We consider a multi-input, multi-output, underactuated mechanical system in order to suggest to the reader how the standard form (6.13) is generalized to multi-input, multi-output systems. Let $q_1 \in \mathbb{R}^{n_1}$ and $q_2 \in \mathbb{R}^{n_2}$, with $\tau_1 \in \mathbb{R}^{n_1}$. Consider the second-order underactuated mechanical system
$$\underbrace{\begin{bmatrix} M_{11}(q) & M_{12}(q) \\ M_{21}(q) & M_{22}(q) \end{bmatrix}}_{M(q)} \begin{bmatrix} \ddot{q}_1 \\ \ddot{q}_2 \end{bmatrix} = \begin{bmatrix} F_1(q,\dot{q}) \\ F_2(q,\dot{q}) \end{bmatrix} + \begin{bmatrix} \tau_1 \\ 0 \end{bmatrix} \qquad (6.40)$$
with output $y = q_1$ and generalized applied force $\tau_1$. The mass matrix $M(q)$ is assumed positive-definite and symmetric, hence the submatrices $M_{11}(q) \in \mathbb{R}^{n_1 \times n_1}$ and $M_{22}(q) \in \mathbb{R}^{n_2 \times n_2}$ are also symmetric and positive-definite. Since $M(q)$ is nonsingular, a unique solution exists for $\ddot{q}_2$ in terms of $q_1$, $q_2$, $\dot{q}_1$, $\dot{q}_2$, and $\tau$. Consider the state-dependent input transformation which generates $\tau_1$ from a new input $u$,
$$\tau_1 = M_{11}(q)u + M_{12}(q)\ddot{q}_2 - F_1(q,\dot{q}) \qquad (6.41)$$
Substitute this expression for $\tau_1$ into the first $n_1$ equations of (6.40) to get
$$\ddot{q}_1 = u \qquad (6.42)$$
Substitute (6.42) into the last $n_2$ equations of (6.40) and solve for $\ddot{q}_2$ to get
$$\ddot{q}_2 = M_{22}(q)^{-1}F_2(q,\dot{q}) - M_{22}(q)^{-1}M_{21}(q)u \qquad (6.43)$$
which, when substituted into (6.41), gives
$$\tau_1 = \left( M_{11}(q) - M_{12}(q)M_{22}(q)^{-1}M_{21}(q) \right)u + M_{12}(q)M_{22}(q)^{-1}F_2(q,\dot{q}) - F_1(q,\dot{q}) \qquad (6.44)$$

A mechanical system is underactuated when there are fewer generalized forces than generalized coordinates.
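The algebra in (6.41)-(6.44) can be checked numerically at a frozen configuration $q$: generate a random symmetric positive-definite mass matrix, pick a desired $\ddot{q}_1 = u$, compute $\tau_1$ from the transformation, substitute it back into (6.40), and verify that the resulting accelerations agree. The block sizes and data below are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(1)
n1, n2 = 2, 3
n = n1 + n2
# a random symmetric positive-definite mass matrix M(q) at some fixed q
R = rng.standard_normal((n, n))
M = R @ R.T + n * np.eye(n)
M11, M12 = M[:n1, :n1], M[:n1, n1:]
M21, M22 = M[n1:, :n1], M[n1:, n1:]
F1, F2 = rng.standard_normal(n1), rng.standard_normal(n2)
u = rng.standard_normal(n1)          # desired q1_ddot

# (6.43): q2_ddot = M22^{-1} F2 - M22^{-1} M21 u
q2dd = np.linalg.solve(M22, F2) - np.linalg.solve(M22, M21 @ u)
# (6.41): tau1 = M11 u + M12 q2_ddot - F1  (equivalent to (6.44))
tau1 = M11 @ u + M12 @ q2dd - F1

# plug tau1 back into the original dynamics (6.40) and solve for q_ddot
qdd = np.linalg.solve(M, np.concatenate([F1 + tau1, F2]))
assert np.allclose(qdd[:n1], u)      # first block collapses to q1_ddot = u, (6.42)
assert np.allclose(qdd[n1:], q2dd)   # and q2_ddot matches (6.43)
```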

Let
$$x_1 = q_1, \quad x_2 = \dot{q}_1, \quad \theta_1 = q_2, \quad \theta_2 = \dot{q}_2 \qquad (6.45)$$
Our mechanical system now takes the form
$$\dot{x}_1 = x_2, \quad \dot{x}_2 = u, \quad \dot{\theta}_1 = \theta_2, \quad \dot{\theta}_2 = M_{22}(q)^{-1}F_2(q,\dot{q}) - M_{22}(q)^{-1}M_{21}(q)u \qquad (6.46)$$
with output $y = x_1$.

If $M_{21}(q)$ is full rank for all $q$ in a neighborhood of the origin, then (6.46) is an E/I convertible system in that neighborhood. Spong [Spo96] calls this full-rank condition on $M_{21}(q)$ strong inertial coupling.

Special Case: For the special case of $n_1 = n_2 = 1$, (6.46) is in single-input, single-output E/I convertible form (6.13).

6.3.6  Balance Systems

In this subsection we define balance systems as a special class of E/I convertible systems by imposing a number of assumptions on the E/I convertible form.

Assumption 6.3.11  The pair $(A,B)$ from the Jacobian linearization (6.21) of $\Sigma(u)$ (6.13) at $(x,\theta) = (0,0)$ is controllable.

Let $\bar{\theta}_2$ denote $[\theta_2, \ldots, \theta_p]^T$. Let $(x; \theta_1, \bar{\theta}_2)$ denote $(x,\theta)$ with $\theta_1 \in \mathbb{R}^1$ and $\bar{\theta}_2 \in \mathbb{R}^{p-1}$.

Assumption 6.3.12  Assume that
$$\frac{\partial f}{\partial \theta_1}(0;0,0) > 0 \qquad (6.47)$$


Under Assumption 6.3.12 an E/I convertible system is non-minimum phase. Consider the equation
$$f(x;\theta_1,0) + g(x;\theta_1,0)v = 0 \qquad (6.48)$$
Divide both sides of (6.48) by $g(x;\theta_1,0)$, which is never zero by Assumption 6.3.2, subtract $v$, and multiply by $-1$ to get
$$-g(x;\theta_1,0)^{-1}f(x;\theta_1,0) = v \qquad (6.49)$$

Assumption 6.3.13  Assume that for all $(x;\theta_1,0) \in B_r$,
$$\frac{\partial}{\partial \theta_1}\left[ g(x;\theta_1,0)^{-1}f(x;\theta_1,0) \right] \neq 0 \qquad (6.50)$$
i.e. $\theta_1 \mapsto v$ is strictly monotone.

The significance of Assumption 6.3.13 relates to the existence of an internal equilibrium manifold, as will be explained in Section 6.6 below.

Definition 6.3.14  An external/internal convertible system which satisfies Assumptions 6.3.11, 6.3.12, and 6.3.13 will be referred to as a balance system.

Example 6.3.15  A Balance System.  Consider the E/I convertible form system
$$\dot{x}_1 = x_2, \quad \dot{x}_2 = u, \quad \dot{\theta} = \sin(\theta) + x_1 + u, \quad y = x_1 \qquad (6.51)$$
whose linearization at the origin is
$$\begin{bmatrix} \dot{x}_1 \\ \dot{x}_2 \\ \dot{\theta} \end{bmatrix} = \underbrace{\begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 1 & 0 & 1 \end{bmatrix}}_{A} \begin{bmatrix} x_1 \\ x_2 \\ \theta \end{bmatrix} + \underbrace{\begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}}_{B} u \qquad (6.52)$$
It is easily verified that $(A,B)$ is a controllable pair. For system (6.51),
$$f(x,\theta) = \sin(\theta) + x_1, \qquad g(x,\theta) = 1 \qquad (6.53)$$

Thus
$$\frac{\partial f}{\partial \theta}(0,0) = 1 > 0 \qquad (6.54)$$
so Assumption 6.3.12 is satisfied. Also,
$$\frac{\partial}{\partial \theta}\left[ g(x,\theta)^{-1}f(x,\theta) \right] = \frac{\partial}{\partial \theta}\left[ \sin(\theta) + x_1 \right] = \cos(\theta) \qquad (6.55)$$
is nonzero for all $\theta \in (-\pi/2, \pi/2)$, so Assumption 6.3.13 is satisfied. Thus (6.51) is a balance system for $(x,\theta) \in B_{\pi/2}$.
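Each of the three balance-system assumptions for Example 6.3.15 can be verified numerically; the sketch below uses the linearization (6.52) as reconstructed here (the third row of $A$ comes from $\partial f/\partial x_1 = 1$ and $\partial f/\partial\theta = \cos(0) = 1$).

```python
import numpy as np

# Example 6.3.15: f(x, th) = sin(th) + x1, g = 1
A = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0],
              [1.0, 0.0, 1.0]])        # linearization (6.52) as reconstructed
B = np.array([[0.0], [1.0], [1.0]])
C = np.hstack([B, A @ B, A @ A @ B])   # controllability matrix
assert np.linalg.matrix_rank(C) == 3   # Assumption 6.3.11

# Assumption 6.3.12: df(0;0,0)/dth = cos(0) = 1 > 0
assert np.cos(0.0) > 0

# Assumption 6.3.13: d/dth [g^{-1} f] = cos(th) != 0 on (-pi/2, pi/2)
th = np.linspace(-np.pi / 2 + 1e-3, np.pi / 2 - 1e-3, 1001)
assert np.all(np.cos(th) > 0)
```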

6.3.7  The Regulation and Tracking Problems

In this subsection we define the tracking problem, with the regulation problem as a special case. For $n, m \in \mathbb{Z}^+$, $n \le m$, and $y(\cdot) \in C^m[0,\infty)$, let
$$y^{(n,m)} := \left[ y^{(n)}, y^{(n+1)}, \ldots, y^{(m)} \right]^T \qquad (6.56)$$
Let the class of output reference trajectories $y_d(\cdot)$ be $C^n$ on $[0,\infty)$, with the norm
$$\left\| y_d^{(0,n)} \right\|_\infty = \sup_{t \ge 0} \left\| y_d^{(0,n)}(t) \right\| \qquad (6.57)$$
and the corresponding open ball
$$B_\rho^{(n)} = \left\{ y_d(\cdot) \ \middle| \ \left\| y_d^{(0,n)} \right\|_\infty < \rho \right\} \qquad (6.58)$$

Let $C(v)$ denote the compensator for a balance system $\Sigma(u)$ (6.13) defined by
$$C(v): \quad \dot{z} = a(x,\theta,z,v), \qquad u = b(x,\theta,z,v) \qquad (6.59)$$
with $x \in \mathbb{R}^m$, $\theta \in \mathbb{R}^p$, $z \in \mathbb{R}^q$, $v \in \mathbb{R}^k$, $u \in \mathbb{R}^m$, and where $a(x,\theta,z,v)$ and $b(x,\theta,z,v)$ are smooth vector fields with $a(0,0,0,0) = 0$ and $b(0,0,0,0) = 0$. Let $[C(v), \Sigma(u)]$ denote the interconnection of $C(v)$ and $\Sigma(u)$ as illustrated in Figure 6.8.

Figure 6.8: The interconnection of plant $\Sigma(u)$ and compensator $C(v)$.

We will concern ourselves with the following problem.

Problem 6.3.16  Asymptotic Approximate Tracking.  Let $y$ be the output of $[C(v), \Sigma(u)]$. Assume $y_d \in B_\rho^{(n)}$. Find an integer $q \ge 0$ and smooth functions $a: \mathbb{R}^m \times \mathbb{R}^p \times \mathbb{R}^q \times \mathbb{R}^k \to \mathbb{R}^q$ and $b: \mathbb{R}^m \times \mathbb{R}^p \times \mathbb{R}^q \times \mathbb{R}^k \to \mathbb{R}^m$, such that for all $(x(0), \theta(0), z(0))$ in a neighborhood of the origin,

i. $\lim_{t\to\infty} \|y(t) - y_d(t)\| < \beta_1(\rho)$, where $\beta_1(\rho)$ is a class-K function of $\rho$ (see Appendix A),

ii. for a class-K function of $\rho$, $\beta_2(\rho)$, if $\|\theta(0)\| < \beta_2(\rho)$, then $\|\theta(t)\| < \beta_2(\rho)$ for all $t \ge 0$,

iii. the equilibrium $(0,0)$ of $[C(v), \Sigma(u)]$ is locally exponentially stable,

iv. $(\theta(t), z(t))$ is bounded on $\mathbb{R}^+$.

Definition 6.3.17  The Regulation Problem.  The regulation problem is the tracking problem, Problem 6.3.16, for $y_d \equiv 0$.

The dynamic part of the compensator $C(v)$ that will be described will turn out to be a dynamic inverter.

6.3.8  A Comment on Normal Form

Single-input, single-output nonlinear autonomous control systems of the form (6.37) may always be brought into a normal form (see Isidori [Isi89], page 152) in which the input $u$ does not appear in the internal subsystem. This is a result of the fact that the input vector field $\bar{g}$ (6.37) is one-dimensional and therefore constitutes an involutive distribution. Thus there exists an embedded submanifold of the state space $\mathbb{R}^n$, and an $(n-1)$-dimensional distribution $\Delta(\bar{x})$, tangent to the embedded submanifold, such that $\bar{g}$ is not in $\Delta(\bar{x})$. By changing the vector field basis in which the system is expressed to be a union of $\bar{g}$ together with a smooth basis of $\Delta(\bar{x})$, the normal form is achieved. Since form (6.13) is a special case of form (6.37), we may also bring (6.13) to normal form. However, the control methodology we will develop in this chapter is not confined to single-input, single-output systems; single-input, single-output systems have only been chosen for simplicity of exposition. We will, in fact, rely upon the appearance of $u$ in the internal subsystem. In the multi-input, multi-output case, a normal form, in the sense of Isidori [Isi89], exists only for the restricted class of control systems having involutive input distribution. If the input distribution is indeed involutive, our method is unaffected by this. But by not relying upon involutivity we retain the utility of our control scheme over a much wider variety of systems.

6.4  Controlling the External Subsystem

In this section we describe a tracking controller for the external subsystem $\Sigma_{\rm ext}(u)$ (see Equation (6.14)), disregarding, for the moment, the evolution of the internal subsystem $\Sigma_{\rm int}(x,u)$ (6.15). This tracking controller will play a role in our final control law.

6.4.1  The External Tracking Dynamics

We define the external tracking controller by
External Tracking Controller
$$u_{\rm ext} = v_{\rm ext}(y_d^{(0,m)}, x) := y_d^{(m)}(t) - \sum_{i=1}^m \alpha_i \left( x_i - y_d^{(i-1)}(t) \right) \qquad (6.60)$$
where the $\alpha_i$, $i \le m$, are chosen so that the roots of the polynomial
$$s^m + \sum_{i=1}^m \alpha_i s^{i-1} \qquad (6.61)$$
are in $\mathbb{C}_-^\circ$.

Remark 6.4.1  The external tracking controller is dependent solely upon the external state $x$ and on $y_d^{(0,m)}(t)$. The evolution of the internal state $\theta$ is ignored.

In light of Remark 6.4.1 we cannot expect $\Sigma(u_{\rm ext})$, the system (6.13) with $u := u_{\rm ext}$, to be internally stable. For instance, if $y_d \equiv 0$ and $\Sigma(u)$ has unstable zero dynamics, then by definition of unstable zero dynamics, $\Sigma(u_{\rm ext})$ is not internally stable.

Given a reference output $y_d(t) \in C^n$, the nominal external dynamics are


Nominal External Dynamics
$$\bar{\Sigma}_{\rm ext}(u_{\rm ext}): \quad \dot{x}_i = x_{i+1}, \ i \le m-1, \qquad \dot{x}_m = y_d^{(m)}(t) - \sum_{i=1}^m \alpha_i \left( x_i - y_d^{(i-1)}(t) \right) \qquad (6.62)$$
We call the vector field ${\rm vf}(\bar{\Sigma}_{\rm ext}(u_{\rm ext}))$ (see Appendix A for notation) the nominal external vector field,

Nominal External Vector Field
$$N_{\rm ext}(y_d^{(0,m)}, x) = \begin{bmatrix} x_2 \\ \vdots \\ x_m \\ y_d^{(m)}(t) - \sum_{i=1}^m \alpha_i \left( x_i - y_d^{(i-1)}(t) \right) \end{bmatrix} \qquad (6.63)$$
The nominal external dynamics are the dynamics we would like the external system $\Sigma_{\rm ext}(u)$ (6.14) to obey if we could ignore the internal $\theta$-dynamics.

Example 6.4.2  External Tracking Controller and External Dynamics.  Consider the system
$$\dot{x}_1 = x_2, \quad \dot{x}_2 = u, \quad \dot{\theta}_1 = \theta_2, \quad \dot{\theta}_2 = f(x,\theta) + g(x,\theta)u \qquad (6.64)$$
The external subsystem for this system is
$$\dot{x}_1 = x_2, \qquad \dot{x}_2 = u \qquad (6.65)$$
Given $y_d(t) \in C^2$, an external tracking controller (6.60) is
$$u_{\rm ext} = v_{\rm ext} = \ddot{y}_d - \alpha_2(x_2 - \dot{y}_d) - \alpha_1(x_1 - y_d) \qquad (6.66)$$
and the nominal external dynamics (6.62) are
$$\dot{x}_1 = x_2, \qquad \dot{x}_2 = \ddot{y}_d(t) - \alpha_2(x_2 - \dot{y}_d) - \alpha_1(x_1 - y_d) \qquad (6.67)$$
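The nominal external dynamics (6.67) can be simulated to see the output error decay. The sketch below is illustrative: the gains $\alpha_1 = 1$, $\alpha_2 = 2$ (placing both roots of (6.61) at $s = -1$) and the reference $y_d(t) = \sin t$ are arbitrary choices, not values from the text.

```python
import numpy as np
from scipy.integrate import solve_ivp

a1, a2 = 1.0, 2.0                     # s^2 + 2s + 1: both roots at s = -1
yd   = lambda t: np.sin(t)            # illustrative reference output
ydd  = lambda t: np.cos(t)
yddd = lambda t: -np.sin(t)

def nominal_external(t, x):
    # (6.67): x1_dot = x2, x2_dot = yd'' - a2 (x2 - yd') - a1 (x1 - yd)
    return [x[1], yddd(t) - a2 * (x[1] - ydd(t)) - a1 * (x[0] - yd(t))]

sol = solve_ivp(nominal_external, (0.0, 15.0), [0.5, 0.0],
                rtol=1e-9, atol=1e-12)
x1_T = sol.y[0, -1]
assert abs(x1_T - yd(15.0)) < 1e-4    # tracking error has decayed
```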

6.5  Controlling the Internal Subsystem

In this section we describe a controller for the internal subsystem $\Sigma_{\rm int}(x,u)$, disregarding, again for the moment, the evolution of the external subsystem $\Sigma_{\rm ext}(u)$ and thus the output $y(t)$. Like the external tracking controller, the internal tracking controller will play a crucial role in our final tracking control law.

6.5.1  The Internal Tracking Dynamics

We will associate with the internal subsystem $\Sigma_{\rm int}(x,u)$ the output $y^c = \theta_1$, which may be identified as the output of the dual system $\Sigma^d(v)$ (6.20). Given a reference output $\theta_d(t) \in C^p$, the internal tracking controller is

Internal Tracking Controller
$$u_{\rm int}(v_{\rm int}) = g(x,\theta)^{-1}\left( v_{\rm int} - f(x,\theta) \right), \qquad v_{\rm int} = \theta_d^{(p)} - \sum_{i=1}^p \beta_i \left( \theta_i - \theta_d^{(i-1)} \right) \qquad (6.68)$$
where the $\beta_i$, $i \le p$, are chosen so that the roots of the polynomial
$$s^p + \sum_{i=1}^p \beta_i s^{i-1} \qquad (6.69)$$
have strictly negative real parts.

Given a reference output $\theta_d(t)$, the nominal internal dynamics (see Figure 6.9) are $\Sigma_{\rm int}(x, u_{\rm int}(v_{\rm int}))$ (see (6.15) and (6.68)),


Nominal Internal Dynamics
$$\dot{\theta}_i = \theta_{i+1}, \ i \le p-1, \qquad \dot{\theta}_p = \theta_d^{(p)} - \sum_{i=1}^p \beta_i \left( \theta_i - \theta_d^{(i-1)} \right) \qquad (6.70)$$
If we could ignore the external dynamics $\Sigma_{\rm ext}(u_{\rm int})$ of $\Sigma(u_{\rm int})$, the nominal internal dynamics (6.70) are the dynamics we would like the state of $\Sigma_{\rm int}(x,u)$ to obey in order for $\theta(t)$ to converge exponentially to $\theta_d(t)$.

Figure 6.9: The internal tracking controller.

Example 6.5.1  Internal Tracking Controller and Internal Dynamics.  Consider the system (6.64) of Example 6.4.2. Assume $\theta_d(t)$ is $C^2$. An internal tracking controller for system (6.64) is
$$u_{\rm int} = g(x,\theta)^{-1}\left( -f(x,\theta) + v_{\rm int} \right), \qquad v_{\rm int} = \ddot{\theta}_d - \beta_2(\theta_2 - \dot{\theta}_d) - \beta_1(\theta_1 - \theta_d) \qquad (6.71)$$
and the nominal internal dynamics are
$$\dot{\theta}_1 = \theta_2, \qquad \dot{\theta}_2 = \ddot{\theta}_d - \beta_2(\theta_2 - \dot{\theta}_d) - \beta_1(\theta_1 - \theta_d) \qquad (6.72)$$


Remark 6.5.2  In our final tracking strategy we will construct a value of the signal $\theta_d$, which we will call $\theta_e$, that depends upon $y_d^{(0,m)}(t)$ and the external state $x$. We will then use a controller of a form similar to $u_{\rm int}(v_{\rm int})$ (6.68) in order to cause $\theta_1$ to approximately track $\theta_e$. It will be shown in Section 6.7 that under appropriate conditions, if $\theta_1$ approximately tracks $\theta_e$, then $y$ approximately tracks $y_d(t)$.

6.6  The Internal Equilibrium Manifold

In this section we construct the internal equilibrium angle $\theta_e$, actually a function of $x$ and a choice of external controller $v_{\rm ext}$. We will use $\theta_e$ and approximations of its time-derivatives in order to construct an approximate tracking controller for $\Sigma(u)$. The internal equilibrium angle(7) $\theta_e$ is constructed as part of the equilibrium solution of $\Sigma_{\rm int}(x, u_{\rm ext})$, the internal subsystem (6.15) with the external tracking controller (6.60) applied. The internal equilibrium equations, $\mathcal{E}$, are

Internal Equilibrium Equations
$$\mathcal{E}: \quad 0 = \theta_2, \ \ldots, \ 0 = \theta_p, \qquad 0 = f(x,\theta) + g(x,\theta)\,v_{\rm ext}(y_d^{(0,m)}(t), x) \qquad (6.73)$$
The first $p-1$ equations of $\mathcal{E}$ dictate that for the equilibrium solution,
$$\theta_2 = \ldots = \theta_p = 0 \qquad (6.74)$$
Again, let
$$\bar{\theta}_2 := [\theta_2, \ldots, \theta_p]^T \qquad (6.75)$$
with $\theta = (\theta_1, \bar{\theta}_2)$. Substitute the solution $\bar{\theta}_2 = 0$ of the first $p-1$ equations of $\mathcal{E}$ into the last equation of $\mathcal{E}$, and let $(\theta_e, 0, \ldots, 0)$ be denoted by $(\theta_e, 0)$ to get

(7) We use the term angle because in the case of the inverted pendulum on a cart, $\theta_e$ corresponds to an angle of the pendulum.


Internal Equilibrium Angle
$$0 = f(x; \theta_e, 0) + g(x; \theta_e, 0)\,v_{\rm ext}(y_d^{(0,m)}(t), x) \qquad (6.76)$$
where we have replaced $\theta_1$ in (6.73) by $\theta_e$. We call the solution $\theta_e$ of (6.76) the internal equilibrium angle. Assumption 6.3.13, together with Assumption 6.3.1 and the implicit function theorem, implies that for all $x$ in a neighborhood of the origin there exists an invertible $x$-dependent map $v \mapsto \theta_e(x,v)$ such that
$$f(x; \theta_e(x,v), 0) + g(x; \theta_e(x,v), 0)\,v = 0 \qquad (6.77)$$
We will abuse notation by writing $\theta_e(y_d^{(0,m)}, x) = \theta_e(x, v_{\rm ext}(y_d^{(0,m)}, x))$. Now note (see (6.60)) that $v_{\rm ext}(0,0) = 0$ and that $v_{\rm ext}(y_d^{(0,m)}, x)$ is continuous in $y_d^{(0,m)}$ and $x$. Therefore the solution $\theta_e(y_d^{(0,m)}, x)$ to (6.76) exists for all $y_d^{(0,m)}$ and $x$ sufficiently small.
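For the balance system of Example 6.3.15, equation (6.76) reads $0 = \sin(\theta_e) + x_1 + v_{\rm ext}$, and Assumption 6.3.13 guarantees a unique root on $(-\pi/2, \pi/2)$. The sketch below solves it with a scalar bracketing root-finder and compares against the closed form $\theta_e = \arcsin(-(x_1 + v_{\rm ext}))$ available for this particular example; the test points are arbitrary small values.

```python
import numpy as np
from scipy.optimize import brentq

# Balance system of Example 6.3.15: f = sin(th) + x1, g = 1, so (6.76) reads
#   0 = sin(th_e) + x1 + v_ext
def theta_e(x1, v_ext):
    F = lambda th: np.sin(th) + x1 + v_ext
    # strictly monotone in th on (-pi/2, pi/2) by Assumption 6.3.13 -> unique root
    return brentq(F, -np.pi / 2 + 1e-9, np.pi / 2 - 1e-9)

for x1, v in [(0.1, 0.2), (-0.4, 0.1), (0.0, -0.7)]:
    th = theta_e(x1, v)
    assert abs(np.sin(th) + x1 + v) < 1e-10        # residual of (6.76)
    assert abs(th - np.arcsin(-(x1 + v))) < 1e-8   # closed form for this example
```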

Assumption 6.6.1  Assume that $\theta_e(x,v)$ is $C^p$ in $x$ and $v$ with all mixed partial derivatives up to order $p$ bounded on $B_r$.

Remark 6.6.2  Note that
$$-g(x;\theta_e,0)^{-1}f(x;\theta_e,0) = v_{\rm ext}(y_d^{(0,m)}(t), x) \qquad (6.78)$$
Equation (6.78) will play a key role later in the analysis of our final controller, where (6.78) will reappear as the first term in a Taylor expansion.

Remark 6.6.3  Derivatives of the Internal Equilibrium Angle.  Even though $\theta_e$ is the solution to an equilibrium equation, it is not generally true that $\frac{d^i}{dt^i}\theta_e = 0$. This is because the internal equilibrium equation $\mathcal{E}$ is dependent upon the exogenous variables $x$ and $y_d^{(0,m)}(t)$, which are themselves functions of $t$.

Remark 6.6.4  When $y_d^{(0,m)} \equiv 0$, $\theta_e$ is the solution of
$$f(x;\theta_e,0) - g(x;\theta_e,0)\sum_{i=1}^m \alpha_i x_i = 0 \qquad (6.79)$$
thus $\theta_e$ is not identically zero when $y_d^{(0,m)} \equiv 0$.

Under certain regularity conditions, the implicit equations $\mathcal{E}$ (6.73) define an $m$-dimensional submanifold of the $n$-dimensional state space of $\Sigma(u)$. More precisely, we define the internal equilibrium manifold as follows:

Internal Equilibrium Manifold
$$E(t) = \left\{ (x,\theta) \in B_r \subset \mathbb{R}^m \times \mathbb{R}^p \ \middle| \ \theta_1 = \theta_e(y_d^{(0,m)}(t), x), \ \theta_2 = \ldots = \theta_p = 0 \right\} \qquad (6.80)$$

Property 6.6.5  The manifold $E(t)$ is an intrinsic geometric structure associated with the pair $(y_d, \Sigma(u))$ having the following properties:

$t$-dependence. The internal equilibrium manifold is $t$-dependent, except in the regulation case where $y_d^{(i)}(t)$, $i \le m$, are identically zero.

Dimension. The internal equilibrium manifold is of dimension $m$ and codimension $p$.

Graph Property. The internal equilibrium manifold is a $t$-dependent graph over the $m$-dimensional $x$-subspace of the state space of $\Sigma_{\rm ext}(u)$.

Independence of the Applied Input. The definition of $E(t)$ is independent of the input $u$ applied to $\Sigma(u)$, though it is dependent on $v_{\rm ext}(y_d^{(0,m)}(t), x)$.

Regulation Case. When $y_d(t) \equiv 0$, $(x,\theta) = (0,0)$ is in $E(t)$.

Special Case. If $f$ and $g$ are independent of $x$, then the level sets of $\theta_e(y_d^{(0,m)}, x)$ are $(m-1)$-dimensional time-varying hyperplanes embedded in the $m$-dimensional $x$-subspace of the state space $\mathbb{R}^n$. In this case $\theta_e$ may be regarded as a function of the value of $v_{\rm ext}$. Figure 6.10 illustrates this for the case of $x \in \mathbb{R}^2$ and $y_d^{(0,2)} \equiv 0$.

Figure 6.10: When $f(x,\theta)$ and $g(x,\theta)$ are independent of $x$, then $\theta_e$ may be regarded as a time-varying function of $v_{\rm ext}$.

6.6.1  Derivatives Along the Internal Equilibrium Manifold

As mentioned above, we may regard $E(t)$ as being a $t$-dependent graph over the $m$-dimensional $x$-subspace $\mathbb{R}^m$ of $\Sigma_{\rm ext}(u)$. Consider a smooth vector field $N: \mathbb{R}^+ \times \mathbb{R}^m \to \mathbb{R}^m$; $(t,x) \mapsto N(t,x)$. Thus for each $t$ we have a vector field $N(t,\cdot)$ in $\mathbb{R}^m$. Let $h(t,x)$ be a smooth real-valued function of $t$ and $x$. The $i$th Lie derivative (or directional derivative) of $h(t,x)$ along $N(t,x)$, holding $t$ fixed and evaluating at $(t,x)$, is denoted $L_N^i h(t,x)$, and is defined recursively by $L_N h(t,x) = L_N^1 h(t,x) = dh(t,x)\,N(t,x)$ and $L_N^i h(t,x) = L_N(L_N^{i-1} h(t,x))$, with $L_N^0 h(t,x) = h(t,x)$. Define $\hat{L}_N^i h(t,x)$ by
$$\hat{L}_N h(t,x) := L_N h(t,x) + \frac{\partial h}{\partial t} \qquad (6.81)$$
Similarly define
$$\hat{L}_N^i h(t,x) := \hat{L}_N \hat{L}_N^{i-1} h(t,x) \qquad (6.82)$$
Thus $\hat{L}_N^i h(t,x)$ is the $i$th derivative with respect to $t$ of the function $h(t,x)$ along the solutions of $\dot{x} = N(t,x)$. The $i$th derivative of $\theta_e(y_d^{(0,m)}, x)$ along the vector field $N(t,x)$ is then $\hat{L}_N^i \theta_e$.

Recall the nominal external vector field $N_{\rm ext}(y_d^{(0,m)}, x)$ (6.63). In the next section we will use $\hat{L}_{N_{\rm ext}}^i \theta_e$ as an approximator for $\frac{d^i}{dt^i}\theta_e$.
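Definition (6.81) can be checked on a concrete case. The sketch below uses the illustrative choices $N(t,x) = (x_2, -x_1)$ and $h(t,x) = \sin t + x_1 x_2$ (not from the text): then $\hat{L}_N h = \cos t + x_2^2 - x_1^2$, and along the explicit solution $x(t) = (\cos t, -\sin t)$ of $\dot{x} = N(t,x)$ this must equal $\frac{d}{dt} h(t, x(t)) = \cos t - \cos 2t$.

```python
import numpy as np

# h(t, x) = sin(t) + x1*x2 along the field N(t, x) = (x2, -x1):
# (6.81): Lhat_N h = dh/dt + (dh/dx) . N = cos(t) + x2^2 - x1^2
Lhat = lambda t, x1, x2: np.cos(t) + x2**2 - x1**2

# x(t) = (cos t, -sin t) solves xdot = N(t, x), and along it
# h(t, x(t)) = sin t - sin t cos t, whose time derivative is cos t - cos 2t
for t in np.linspace(0.0, 6.0, 25):
    x1, x2 = np.cos(t), -np.sin(t)
    assert abs(Lhat(t, x1, x2) - (np.cos(t) - np.cos(2 * t))) < 1e-12
```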


6.7  Approximate Tracking

In this section we will combine the results of the previous sections in order to construct a controller for approximate output tracking of $y_d(t) \in B_\rho^{(n)}$. The internal equilibrium controller is constructed by substituting $\theta_e$ for $\theta_d$, and $\hat{L}_{N_{\rm ext}}^i \theta_e$ for $\theta_d^{(i)}$, in the internal tracking controller (6.68).

Internal Equilibrium Controller
$$u_e = u_{\rm int}(v_e) = g(x,\theta)^{-1}\left( -f(x,\theta) + v_e \right), \qquad v_e = \hat{L}_{N_{\rm ext}}^p \theta_e - \sum_{i=1}^p \beta_i \left( \theta_i - \hat{L}_{N_{\rm ext}}^{i-1} \theta_e \right) \qquad (6.83)$$

6.7.1  Error Coordinates

It will be convenient to use error coordinates in our analysis of the stability of $\Sigma(u_e)$. Let
$$e_x^i = x_i - y_d^{(i-1)}, \ i \le m, \qquad e_\theta^i = \theta_i - \theta_e^{(i-1)}, \ i \le p \qquad (6.84)$$
with the external error $e_x := [e_x^1, \ldots, e_x^m]^T$ and the internal error $e_\theta := [e_\theta^1, \ldots, e_\theta^p]^T$. Let $e := [e_x^1, \ldots, e_x^m, e_\theta^1, \ldots, e_\theta^p]^T$. Note particularly that $\theta_e$ may be regarded as a function of $y_d^{(0,m)}$ and $e_x$, or of $y_d^{(0,m)}$ and $x$.

6.7.2  Analysis of the Internal Equilibrium Controller

We will show that the system $\Sigma(u_e)$, defined as system $\Sigma(u)$ (6.13) with the input $u_e$ (6.83), may be regarded as an exponentially stable system under an affine perturbation (see Definition 6.7.1 below). It will be seen that the internal equilibrium controller approximately decouples the dynamics of $e_x$ and $e_\theta$. Later, in Section 6.8, we will bring in dynamic inversion in order to estimate $\theta_e$. For the purposes of the present analysis, however, we assume that $\theta_e$ is available explicitly.

Insert $u_e$ (6.83) into $\Sigma(u)$ (6.13) to get $\Sigma(u_e(v_e))$:
$$\dot{x}_i = x_{i+1}, \ i \le m-1, \qquad \dot{x}_m = -g(x,\theta)^{-1}f(x,\theta) + g(x,\theta)^{-1}v_e$$
$$\dot{\theta}_i = \theta_{i+1}, \ i \le p-1, \qquad \dot{\theta}_p = v_e \qquad (6.85)$$

where $v_e$ is given by (6.83). Expand $v_e$ as
$$v_e = \hat{L}_{N_{\rm ext}}^p \theta_e - \sum_{i=1}^p \beta_i \left( \theta_i - \hat{L}_{N_{\rm ext}}^{i-1} \theta_e \right) = \theta_e^{(p)} - \sum_{i=1}^p \beta_i e_\theta^i + p_\theta(y_d^{(0,n)}, e) \qquad (6.86)$$
where
$$p_\theta(y_d^{(0,n)}, e) = \left( \hat{L}_{N_{\rm ext}}^p \theta_e - \theta_e^{(p)} \right) + \sum_{i=1}^p \beta_i \left( \hat{L}_{N_{\rm ext}}^{i-1} \theta_e - \theta_e^{(i-1)} \right) = O\!\left( \|y_d^{(0,n)}(t)\|, \|e\| \right)$$
Note that
$$-g(x,\theta)^{-1}f(x,\theta) = -g(x;\theta_e,0)^{-1}f(x;\theta_e,0) + q_x(y_d^{(0,n)}(t), e) \qquad (6.87)$$
where $q_x(y_d^{(0,n)}(t), e) = O(\|y_d^{(0,n)}(t)\|, \|e\|)$. Substituting from (6.78) gives
$$-g(x,\theta)^{-1}f(x,\theta) = v_{\rm ext}(y_d^{(0,m)}(t), x) + q_x(y_d^{(0,n)}, e) = y_d^{(m)} - \sum_{i=1}^m \alpha_i \left( x_i - y_d^{(i-1)} \right) + q_x(y_d^{(0,n)}, e) = y_d^{(m)} - \sum_{i=1}^m \alpha_i e_x^i + q_x(y_d^{(0,n)}, e) \qquad (6.88)$$
It follows that
$$u_e = y_d^{(m)} - \sum_{i=1}^m \alpha_i e_x^i + q_x(y_d^{(0,n)}, e) + g(x,\theta)^{-1}v_e = y_d^{(m)} - \sum_{i=1}^m \alpha_i e_x^i + p_x(y_d^{(0,n)}, e) \qquad (6.89)$$
where
$$p_x(y_d^{(0,n)}(t), e) = q_x(y_d^{(0,n)}(t), e) + g(x,\theta)^{-1}v_e = O\!\left( \|y_d^{(0,n)}(t)\|, \|e\| \right) \qquad (6.90)$$
Thus we may regard $\Sigma(u_e)$ as
$$\dot{x}_i = x_{i+1}, \ i \le m-1, \qquad \dot{x}_m = y_d^{(m)}(t) - \sum_{i=1}^m \alpha_i \left( x_i - y_d^{(i-1)}(t) \right) + p_x(y_d^{(0,n)}(t), e)$$
$$\dot{\theta}_i = \theta_{i+1}, \ i \le p-1, \qquad \dot{\theta}_p = \theta_e^{(p)} - \sum_{i=1}^p \beta_i \left( \theta_i - \theta_e^{(i-1)} \right) + p_\theta(y_d^{(0,n)}(t), e) \qquad (6.91)$$

(i+1)

169

System (6.92) is in the form of a decoupled and exponentially stable (for suitable choice of i s and i s) error system with an added perturbation. Let p(yd
(0,n) T

or, equivalently using internal and ei x = em = x ei = ep =

m i=1

, i m1

i ei x + px (yd

(0,n)

(t), e)

+1 ei , p1 (0,n) i p (t), e) i=1 i e + p (yd

(6.92)

, e) := 0, . . . , 0, px(yd

(0,n)

(t), e), 0, . . ., 0, p(yd


(0,n)

(0,n)

(t), e)

(6.93)
(0,n)

be the perturbation vector of (6.92). The perturbation p(yd


(0,n) (0,n)

(t), e) is Lipschitz in (yd

(t), e)

by our smoothness assumptions. Therefore, there exist k1 > 0 and k2 > 0 such that p(yd (t), e)

k1 + k2 e

k1 yd

(t)

+ k2 e

(6.94)

Definition 6.7.1  Affine Perturbation.  We call a perturbation of the structure (6.94), i.e. a perturbation whose norm is bounded by a constant plus a term linear in the norm of the state, an affine perturbation.

Assumption 6.7.2  Assume that the $\alpha_i$ and $\beta_i$ have been chosen such that when $y_d \equiv 0$, the origin of $\Sigma(u)$, as well as the origin of the nominal error dynamics,
$$\dot{e}_x^i = e_x^{i+1}, \ i \le m-1, \qquad \dot{e}_x^m = -\sum_{i=1}^m \alpha_i e_x^i$$
$$\dot{e}_\theta^i = e_\theta^{i+1}, \ i \le p-1, \qquad \dot{e}_\theta^p = -\sum_{i=1}^p \beta_i e_\theta^i \qquad (6.95)$$
are exponentially stable.

Remark 6.7.3  Stability of the origin of (6.95) does not imply stability of the origin of $\Sigma(u_e)$, because (6.95) is not, in general, the Jacobian linearization of $\Sigma(u_e)$. Instead it is a convenient approximation of the Jacobian linearization with which to construct approximators for $\theta_e^{(i)}$, $i \le p$, and study boundedness of the output error.


In practice Assumption 6.7.2 is easy to enforce; one simply chooses positive values of $\alpha_i$ and $\beta_i$ that make the Jacobian linearization of $\Sigma(u_e)$ exponentially stable at the origin. Then one checks that under these choices of $\alpha_i$ and $\beta_i$ the origin of the (linear) nominal error dynamics (6.95) is exponentially stable. In fact we may, and will in the inverted pendulum example of Section 6.9 below, choose $\beta_i$, $i \le p$, and $\alpha_i$, $i \le m$, such that the Jacobian linearization of $\Sigma(u_e)$ is identical to that of the linearization of $\Sigma(-K[x^T, \theta^T]^T)$, where $K \in \mathbb{R}^{1\times n}$ is the gain matrix of a linear quadratic regulator. Now we make the following claim.

Proposition 6.7.4  Convergence for the Internal Equilibrium Controller.  Assume $y_d \in B_\rho^{(n)}$ for $\rho \ge 0$. If $r_e > 0$ and $k_2 \ge 0$ (see (6.94)) are sufficiently small real numbers, then there exists a $t_1 \ge 0$ and a class-K function $b(\rho)$ such that for all $(e_x(0), e_\theta(0)) \in B_{r_e}$, $(e_x(t), e_\theta(t))$ converges toward zero exponentially with a rate $\gamma > 0$ until $(e_x(t_1), e_\theta(t_1))$ enters $B_{b(\rho)}$. Once $(e_x(t), e_\theta(t))$ enters $B_{b(\rho)}$ it remains in $B_{b(\rho)}$ thereafter.

Remark 6.7.5  A consequence of Proposition 6.7.4 is that the tracking error $y(t) - y_d(t)$ is uniformly ultimately bounded (see Definition B.6.1 of Appendix B), as is the error $\theta_1 - \theta_e$. For brevity, and since $\rho$ is assumed fixed, we will refer to $b(\rho)$ as $b$.

The proof of Proposition 6.7.4 will follow from Theorem 6.7.6 below. Roughly, Theorem 6.7.6 states that under suitable conditions, state trajectories of exponentially stable systems subject to affine perturbations converge exponentially toward the origin up until a time $t_1$ when the solution enters a bounded neighborhood of the origin. After time $t_1$ the solution remains forever within the bounded neighborhood.

Theorem 6.7.6  Exponentially Stable Systems Under Affine Perturbations.  Consider the perturbed system
$$\dot{x} = f(t,x) + g(t,x) \qquad (6.96)$$
with $x \in B_r \subset \mathbb{R}^n$, $f: \mathbb{R}^+ \times B_r \to \mathbb{R}^n$, and $g: \mathbb{R}^+ \times B_r \to \mathbb{R}^n$. Assume that $f$ and $g$ are piecewise continuous in $t$ and locally Lipschitz in $x$. Assume that $x = 0$ is an exponentially stable equilibrium of the nominal system
$$\dot{x} = f(t,x) \qquad (6.97)$$


Figure 6.11: The internal equilibrium controller causes the error $[e_x^T, e_\theta^T]^T$ to converge toward 0 exponentially until it reaches the ball $B_b \subset \mathbb{R}^n$. See Proposition 6.7.4.

There exists a Lyapunov function $V: \mathbb{R}^+ \times B_r \to \mathbb{R}$ for (6.97) which, for some $c_i$, $i \le 4$, satisfies
$$c_1 \|x\|_2^2 \le V(t,x) \le c_2 \|x\|_2^2 \qquad (6.98)$$
$$\frac{\partial V}{\partial t} + \frac{\partial V}{\partial x} f(t,x) \le -c_3 \|x\|_2^2 \qquad (6.99)$$
$$\left\| \frac{\partial V}{\partial x} \right\|_2 \le c_4 \|x\|_2 \qquad (6.100)$$
for all $x \in B_r$. Assume that there exist $p_1 > 0$, $p_2 > 0$, and $\zeta \in (0,1)$ with
$$p_1 < \frac{\zeta (c_3 - p_2 c_4)\, r}{c_4} \qquad (6.101)$$
such that for all $x \in B_r$,
$$\|g(t,x)\| \le p_1 + p_2 \|x\| \qquad (6.102)$$
If $p_2 < c_3/c_4$, then for each $x(t_0) \in B_r$ there exists a $t_1 \ge 0$ and a $\gamma > 0$ such that the solutions of (6.96) satisfy
$$\|x(t)\| \le k \|x(t_0)\| e^{-\gamma (t-t_0)}, \quad t \le t_1 \qquad (6.103)$$
and
$$\|x(t)\| \le b, \quad t \ge t_1 \qquad (6.104)$$
where
$$k = \sqrt{\frac{c_2}{c_1}}, \qquad \gamma = \frac{(1-\zeta)(c_3 - p_2 c_4)}{2 c_2}, \qquad b = \frac{p_1 c_4}{\zeta (c_3 - p_2 c_4)} \sqrt{\frac{c_2}{c_1}} \qquad (6.105)$$
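A scalar illustration of Theorem 6.7.6, with all constants chosen for this example rather than taken from the text: the nominal system $\dot{x} = -x$ admits $V = x^2$ with $c_1 = c_2 = 1$, $c_3 = c_4 = 2$, and under the affine perturbation $g = p_1 + p_2 x$ the trajectory decays until it enters (and then stays inside) the ball of radius $b$ from (6.105) as reconstructed here.

```python
import numpy as np
from scipy.integrate import solve_ivp

# nominal xdot = -x with V = x^2: c1 = c2 = 1, c3 = c4 = 2
# affine perturbation g(t, x) = p1 + p2*x with p2 = 0.2 < c3/c4 = 1
p1, p2, zeta = 0.1, 0.2, 0.9
c1, c2, c3, c4 = 1.0, 1.0, 2.0, 2.0
b = (p1 * c4) / (zeta * (c3 - p2 * c4)) * np.sqrt(c2 / c1)  # bound (6.105)

sol = solve_ivp(lambda t, x: -x + p1 + p2 * x, (0.0, 20.0), [1.0],
                rtol=1e-9, atol=1e-12)
x = sol.y[0]                         # decays monotonically toward 0.125 < b
assert abs(x[-1]) < b                # trajectory ends inside B_b
first_in = np.argmax(np.abs(x) < b)  # first sample inside B_b ...
assert np.all(np.abs(x[first_in:]) < b)  # ... and it stays there
```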


Proof of Theorem 6.7.6:  The proof of Theorem 6.7.6 is a variation of classical results on perturbed systems (see Khalil [Kha92], Section 4.5, page 191). By Theorem B.5.1 of Appendix B, a $V(\cdot,\cdot)$ and constants $c_i$, $i \le 4$, satisfying (6.98), (6.99), and (6.100) exist. We use $V(t,x)$ as a Lyapunov function candidate for the perturbed system (6.96). Differentiate $V(t,x)$ along solutions of (6.96), and use (6.100) and (6.102) to get
$$\dot{V}(t,x) \le -c_3 \|x\|_2^2 + \left\| \frac{\partial V}{\partial x} \right\|_2 \|g(t,x)\|_2 \le -c_3 \|x\|_2^2 + c_4 \|x\|_2 \left( p_1 + p_2 \|x\|_2 \right) = -(c_3 - p_2 c_4)\|x\|_2^2 + p_1 c_4 \|x\|_2 \qquad (6.106)$$
Let $c_5 := c_3 - p_2 c_4$. Since we assume that $p_2 < c_3/c_4$, $c_5 > 0$. Therefore
$$\dot{V}(t,x) \le -c_5 \|x\|_2^2 + p_1 c_4 \|x\|_2 \qquad (6.107)$$
For $\zeta \in (0,1)$,
$$\dot{V}(t,x) \le -(1-\zeta) c_5 \|x\|_2^2 - \zeta c_5 \|x\|_2^2 + p_1 c_4 \|x\|_2 \qquad (6.108)$$
If
$$\|x\|_2 \ge \frac{p_1 c_4}{\zeta c_5} \qquad (6.109)$$
then
$$\dot{V}(t,x) \le -(1-\zeta) c_5 \|x\|_2^2 \qquad (6.110)$$
But (6.109) holds if
$$\|x\|_2 \ge \frac{p_1 c_4}{\zeta c_5} = \frac{p_1 c_4}{\zeta (c_3 - p_2 c_4)} \qquad (6.111)$$
By assumption
$$r > \frac{p_1 c_4}{\zeta (c_3 - p_2 c_4)} \qquad (6.112)$$
so there is a subset $S$ of $B_r$ such that $x \in S$ implies (6.111). Now application of Theorem B.6.2 of Appendix B completes the proof.

We may now prove Proposition 6.7.4.

Proof of Proposition 6.7.4:  We may regard $\Sigma(u_e)$ as being of the form
$$\begin{bmatrix} \dot{e}_x \\ \dot{e}_\theta \end{bmatrix} = \begin{bmatrix} A_x & 0 \\ 0 & A_\theta \end{bmatrix} \begin{bmatrix} e_x \\ e_\theta \end{bmatrix} + p(y_d^{(0,n)}(t), e) \qquad (6.113)$$

Sec. 6.8 where, for yd B


(n)

Estimation of the Internal Equilibrium Angle and e < re , p(yd


(0,n)

173

k2 > 0. By choice of i and i , Ax and A are Hurwitz. Consequently e x e = Ax 0 0 A ex e

(t), e) k1 + k2 e for some k1 > 0 and

(6.114)

is exponentially stable. Therefore by Theorem B.5.1 of Appendix B there exist ci > 0, i 4, the open interval (0, 1). If k2 is suciently small, then there exists a k1max such that k1max = ( c 3 p2 c 4 ) r e . c4

and a Lyapunov function V () satisfying (6.98), (6.99), and (6.100). Let be a number in

(6.115)

Then if k1 k1max , by Theorem 6.7.6 e = (ex , e) converges toward 0 until it arrives in Bb Rn where b= c4 c 3 p2 c 4 c1 re c2 (6.116)

after which time it remains in Bb . Since b is linear in , b is a class-K function of . Thus e(t) is uniformly ultimately bounded by b. This implies that limt y (t) yd (t)
(i)

< b.

By assumption e , i p is bounded on BY Br . Since e is uniformly ultimately bounded, so is . Using the internal equilibrium controller requires that yd be C n because e pends on
(n) yd . (p)

de-

There is nothing intrinsic to the form of our controller which allows us to and re using a

guarantee that a suciently small k2 (see (6.94)) exists for each problem one might encounter. If necessary one could estimate an upper bound on k2 for a given computer. Even when yd
(0,m1)

assure exponential stability. By choosing i, i p and i , i m such that (ue ) has a

0, there are still conditions on k2 required in order to

stable linearization, however, we know that for an arbitrarily small region about the origin, the origin in exponentially stable. So in the worst case, we have good regulation in a neighborhood of the origin irrespective of k2 .
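The ultimate-boundedness mechanism invoked in the proof can be illustrated with a scalar toy system ė = −a e + k1, whose trajectories enter and remain in a ball of radius k1/a about the origin; the rate a and perturbation level k1 below are illustrative values, not quantities from the proof:

```python
def simulate(a, k1, e0, dt=1e-3, T=20.0):
    """Euler-integrate the scalar perturbed system e' = -a*e + k1."""
    e = e0
    for _ in range(int(T / dt)):
        e += dt * (-a * e + k1)
    return e

a, k1 = 2.0, 0.5              # decay rate and constant perturbation level
bound = k1 / a                # ultimate bound for this scalar example
for e0 in (10.0, -10.0, 0.0):
    # every trajectory ends up inside (a tiny margin of) the bound
    assert abs(simulate(a, k1, e0)) <= bound + 1e-6
print(bound)  # 0.25
```

Shrinking k1 shrinks the bound proportionally, which is the class-K dependence asserted above.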

Approximate Output Tracking (Chap. 6)

6.8 Estimation of the Internal Equilibrium Angle

If one assumes that a method exists with which to solve equation (6.76) for θe, the internal equilibrium method of Section 6.7 stands on its own, without the need to introduce dynamic inversion. Dynamic inversion is well suited, however, for obtaining an estimate of θe from the implicit equation, Equation (6.76), that defines it. By substituting the estimator θ̂e for θe in L_Next θe, we obtain estimators for θe^(i). Approximation errors in estimating θe and its derivatives θe^(i) may be regarded as additional terms in the perturbation p(yd^(0,n)(t), e).

The implicit equation to be solved is again

0 = f(x; θe, 0) + g(x; θe, 0) vext   (6.117)

where vext is given by (6.60). We obtain an estimator for θ̇e as follows. Differentiate (6.117) with respect to t to get

0 = ∂f/∂θ (x; θe, 0) θ̇e + ∂f/∂x (x; θe, 0) ẋ + ∂g/∂θ (x; θe, 0) θ̇e vext + ∂g/∂x (x; θe, 0) ẋ vext + g(x; θe, 0) v̇ext   (6.118)

where

v̇ext = d/dt [ yd^(m)(t) − Σ_{i=1}^{m} βi e_x^i ]
     = yd^(m+1)(t) − βm (ue − yd^(m)(t)) − Σ_{i=1}^{m−1} βi e_x^{i+1}   (6.119)

and ẋ = [x2, …, xm, ue]ᵀ. Therefore

θ̇e = −[ ∂f/∂θ (x; θe, 0) + ∂g/∂θ (x; θe, 0) vext ]⁻¹ [ ∂f/∂x (x; θe, 0) ẋ + ∂g/∂x (x; θe, 0) ẋ vext + g(x; θe, 0) v̇ext ]   (6.120)

and the estimator E for θ̇e is obtained by substituting θ̂e for θe in the expression above,

E(x; θ̂e; t) := −[ ∂f/∂θ (x; θ̂e, 0) + ∂g/∂θ (x; θ̂e, 0) vext ]⁻¹ [ ∂f/∂x (x; θ̂e, 0) ẋ + ∂g/∂x (x; θ̂e, 0) ẋ vext + g(x; θ̂e, 0) v̇ext ].   (6.121)

Let

F(θ̂e, x, t) = f(x; θ̂e, 0, …, 0) + g(x; θ̂e, 0, …, 0) vext(x, t).   (6.122)

A dynamic inverter for θe is now

dθ̂e/dt = −sign(D1F(θ̂e, x, t)) ( f(x; θ̂e, 0) + g(x; θ̂e, 0) vext ) + E(x; θ̂e; t)   (6.123)

where E(x; θ̂e; t) is as in (6.121). In the multi-input, multi-output context, where θe is a vector, we may use the dynamic inverter of Theorem 2.4.6 to obtain an estimate for θe.


Remark 6.8.1 An Alternate Derivative Estimator. Rather than using E(x; θ̂e; t) as defined by Equation (6.121) above as a derivative estimator for θ̇e, we can also use L_Next θ̂e. This gives a less accurate estimate of θ̇e than E(x; θ̂e; t), but is often considerably simpler to compute. We will use this simpler estimator in Chapter 7 when we apply internal equilibrium control to the control of a bicycle. The dynamic inverter (6.123) provides the dynamic part of an internal equilibrium controller. Thus the dimension of the dynamic inverter chosen is the number q of Problem 6.3.16. For the single-input, single-output case, using the dynamic inverter (6.123), q = 1.

6.9 Tracking for the Inverted Pendulum on a Cart

The classical control problem of controlling the inverted pendulum on a cart [Kai80] will be used to illustrate both the problem in which we are interested and an application of the approximate tracking controller.

Figure 6.12: Inverted pendulum on a cart.

The cart and pendulum system is illustrated in Figure 6.12. The position of the cart is parameterized by x1 ∈ R, the linear velocity of the pendulum pivot by x2 = ẋ1, the angle of the pendulum away from upright by θ1 ∈ (−π/2, π/2) ⊂ S¹, and the angular velocity of the pendulum by θ2 = θ̇1. The mass of the point-mass pendulum is mp, the length of the pendulum is l, which we will set equal to 1 below, and the mass of the cart is mc. The gravitational acceleration is g.

The Lagrangian for the cart-pendulum system is

L = −mp g l cos(θ1) + (1/2)( l² mp (θ2)² + (mc + mp)(x2)² ) + mp l cos(θ1) θ2 x2.   (6.124)

The Euler-Lagrange equations of motion are

[ mp + mc        l mp cos(θ1) ] [ ẋ2 ]   [ l mp (θ2)² sin(θ1) + τ1 ]
[ l mp cos(θ1)   l² mp        ] [ θ̇2 ] = [ g l mp sin(θ1)          ].   (6.125)

We will follow Example 6.3.10 in converting (6.125) to E/I convertible form. Apply the input transformation of Example 6.3.10,

τ1 = (mp sin²(θ1) + mc) u + mp sin(θ1)( g cos(θ1) − l (θ2)² ),   (6.126)

to make the dynamics of the cart and pendulum system take the E/I convertible form Σ(u):

ẋ1 = x2
ẋ2 = u
θ̇1 = θ2
θ̇2 = g sin(θ1) − cos(θ1) u   (6.127)

where we have set l = 1.
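The conversion above can be checked numerically: applying the input transformation (6.126) to the Euler-Lagrange dynamics (6.125) should reproduce ẋ2 = u and θ̇2 = g sin(θ1) − cos(θ1)u exactly. A minimal sketch (the mass values are illustrative, not from the text):

```python
import numpy as np

g, mp, mc, l = 9.81, 0.2, 1.0, 1.0   # illustrative parameters

def accelerations(th1, th2, tau1):
    """Solve the Euler-Lagrange equations (6.125) for (x2dot, th2dot)."""
    M = np.array([[mp + mc,          l*mp*np.cos(th1)],
                  [l*mp*np.cos(th1), l**2 * mp       ]])
    rhs = np.array([l*mp*th2**2*np.sin(th1) + tau1,
                    g*l*mp*np.sin(th1)])
    return np.linalg.solve(M, rhs)

def tau_from_u(th1, th2, u):
    """Input transformation (6.126)."""
    return (mp*np.sin(th1)**2 + mc)*u + mp*np.sin(th1)*(g*np.cos(th1) - l*th2**2)

th1, th2, u = 0.3, -0.5, 2.0
x2dot, th2dot = accelerations(th1, th2, tau_from_u(th1, th2, u))
assert abs(x2dot - u) < 1e-9                                   # cart obeys x2' = u
assert abs(th2dot - (g*np.sin(th1) - np.cos(th1)*u)) < 1e-9    # pendulum obeys (6.127)
```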

The external tracking controller (6.60) is

vext(yd^(0,2)(t), x) := ÿd(t) − β2 (x2 − ẏd(t)) − β1 (x1 − yd(t))   (6.128)

where β1 and β2 are real numbers chosen such that r² + β2 r + β1 = 0 has roots with strictly negative real parts.

The internal equilibrium manifold for the pendulum is the set E(t) of all (ex, θ) satisfying the system of equations

0 = θ2
0 = g sin(θ1) − cos(θ1) vext(yd^(0,2)(t), x).   (6.129)

The internal equilibrium angle θe for the inverted pendulum is the solution to

g sin(θe) − cos(θe) vext(yd^(0,2)(t), x) = 0.   (6.130)

Though this equation has an explicit solution for θe, namely

θe = arctan( vext(yd^(0,2)(t), x) / g ),   (6.131)
we will use dynamic inversion in order to track an estimate of θe. Note that in this case, since the constraint equations (6.129) depend on x only through vext(yd^(0,2)(t), x), we could think of E(t) as a fixed graph over the space R in which vext resides; i.e., the equations (6.129) describe a relation between θ and vext which does not depend on x or t (see Figure 6.10 and Property 6.6.5). For generality we will ignore this, though our observation is reflected in the property that the level sets of E(t) in the ex phase plane, where vext is defined by (6.128), are parallel lines.

For the inverted pendulum problem we have

vext(yd^(0,2)(t), x) = ÿd(t) − β2 (x2 − ẏd(t)) − β1 (x1 − yd(t))
L_Next vext(yd^(0,3)(t), x) = yd^(3)(t) − β2 (vext − ÿd(t)) − β1 (x2 − ẏd(t))
L²_Next vext(yd^(0,4)(t), x) = yd^(4)(t) − β2 (L_Next vext − yd^(3)(t)) − β1 (vext − ÿd(t)).   (6.132)

For our final control law, (6.137) below, we need L_Next θe and L²_Next θe. To obtain these, differentiate the second equation of (6.129) along the vector field

Next := e_x^2 ∂/∂e_x^1 − (β1 e_x^1 + β2 e_x^2) ∂/∂e_x^2   (6.133)

and solve for L_Next θe and L²_Next θe to obtain

L_Next θe = (g cos(θe) + sin(θe) vext)⁻¹ cos(θe) L_Next vext

L²_Next θe = (g cos(θe) + sin(θe) vext)⁻¹ ( (g sin(θe) − cos(θe) vext)(L_Next θe)² − 2 sin(θe)(L_Next θe)(L_Next vext) + cos(θe) L²_Next vext ).   (6.134)
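Since (6.131) gives θe = arctan(vext/g), the coefficient multiplying L_Next vext in the first line of (6.134) must agree with the derivative d(arctan(v/g))/dv = g/(g² + v²); a quick numerical consistency check:

```python
import numpy as np

g = 9.81

def theta_e(v):
    """Explicit solution (6.131) of the constraint (6.130)."""
    return np.arctan(v / g)

def chain_coeff(v):
    """Coefficient of L_Next v_ext in (6.134): cos(th)/(g cos(th) + sin(th) v)."""
    th = theta_e(v)
    return np.cos(th) / (g*np.cos(th) + np.sin(th)*v)

for v in (-5.0, 0.0, 3.7):
    direct = g / (g**2 + v**2)        # derivative of arctan(v/g) with respect to v
    assert abs(chain_coeff(v) - direct) < 1e-12
```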

The internal tracking controller (6.68) for the cart-pendulum system is

uint(vint) = ( g sin(θ1) − vint ) / cos(θ1)   (6.135)

where

vint = θ̈d − γ2 (θ2 − θ̇d) − γ1 (θ1 − θd)   (6.136)

and θd(t) is a desired C² trajectory for θ1. The internal equilibrium control law (6.83) for the cart and pendulum is then

ue(ve) = uint(ve) = ( g sin(θ1) − ve ) / cos(θ1)
ve = L²_Next θe − γ2 (θ2 − L_Next θe) − γ1 (θ1 − θe)   (6.137)

Again, adjustment of the γi's and βi's must be made so that the linearization of Σ(ue) at the origin is stable when u = ue(ve) and yd ≡ 0, with the polynomials s² + γ2 s + γ1 and s² + β2 s + β1 having roots with strictly negative real parts.

From (6.121) we use the estimator for θ̇e

E(x; θ̂e; t) = ( g cos(θ̂e) + sin(θ̂e) vext )⁻¹ cos(θ̂e) ( yd^(3) − β2 (ue − ÿd) − β1 (x2 − ẏd) ).   (6.138)

The dynamic inverter for approximating θe is

dθ̂e/dt = −( g sin(θ̂e) − cos(θ̂e) vext(yd^(0,2), x) ) + E(x; θ̂e; t)   (6.139)

where we have used the fact that D1F = g cos(θ̂e) + sin(θ̂e) vext > 0 near the internal equilibrium manifold, so that sign(D1F) = 1.
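A simulation sketch of the inverter (6.139). Two simplifications are assumptions of this sketch, not part of the controller above: vext is replaced by a hypothetical exogenous signal v(t) with known derivative, so the estimator term E reduces to the feedforward cos(θ̂e)v̇/(g cos θ̂e + sin θ̂e v), and a gain μ is added to speed convergence. The estimate θ̂e converges to the explicit root arctan(v/g) of (6.130):

```python
import numpy as np

g, mu, dt = 9.81, 20.0, 1e-4   # gravity, inverter gain (assumed), Euler step

def dyn_inverter(v, vdot, T=10.0, th0=0.5):
    """Integrate d(th)/dt = -mu*(g sin th - cos th * v(t)) + E, where E is
    the derivative feedforward cos(th)*vdot(t)/(g cos(th) + sin(th)*v(t))."""
    th, t = th0, 0.0               # th0 is a deliberately wrong initial guess
    for _ in range(int(T / dt)):
        vt, vdt = v(t), vdot(t)
        E = np.cos(th) * vdt / (g*np.cos(th) + np.sin(th)*vt)
        th += dt * (-mu * (g*np.sin(th) - np.cos(th)*vt) + E)
        t += dt
    return th

v    = lambda t: 3.0 * np.sin(t)   # stand-in for v_ext along some trajectory
vdot = lambda t: 3.0 * np.cos(t)
th_hat = dyn_inverter(v, vdot)
# the estimate tracks the explicit solution (6.131)
assert abs(th_hat - np.arctan(v(10.0) / g)) < 1e-3
```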

6.9.1 An Intuitive Description of the Internal Equilibrium Controller

The internal equilibrium controller for the inverted pendulum on a cart may be viewed as follows: For each (e_x^1, e_x^2) and each time t there corresponds a value of the acceleration of the cart ẋ2 = u which, disregarding the internal dynamics, would make the cart track yd(t) according to the nominal external dynamics. For each such value of u there corresponds an angle θe of the pendulum such that, if u were held constant in time at that value, that angle θe would be an (unstable) equilibrium. The internal equilibrium controller strategy is to stabilize a region about that equilibrium θe and to follow θe as it changes due to changes in yd^(i)(t), i ≤ 2, as well as motion of the cart. Note that in the case of output regulation (yd ≡ 0), θe depends only on x.

The situation is shown schematically in Figure 6.13 for the task of regulation to the origin. In this case x1 = e_x^1 and x2 = e_x^2. At the top of the figure x1 is represented by the location of the pendulum pivot, and x2 is represented by the arrow below the pivot. By the conversion to standard form, we have essentially removed the cart dynamics and have assigned an input which is equal to the acceleration of the cart. Thus we do not show the cart in the drawing, just the pendulum pivot whose acceleration we now control. The time sequence of the pendulum frames shown runs from left to right along the top row of pendula, and continues from right to left along the bottom row. A gray line shows the internal equilibrium angle θe. The bottom part of the drawing shows the error dynamics phase plane as well as the internal equilibrium manifold E(t). Two trajectories are shown in the phase space. The black trajectory is (e_x^1, e_x^2, θ1) while the gray trajectory is (e_x^1, e_x^2, θe). The internal equilibrium controller steers (e_x^1, e_x^2, θ1) toward (e_x^1, e_x^2, θe). When (e_x^1, e_x^2, θ1) is close to (e_x^1, e_x^2, θe), then the actual external vector field vf(ext(ue)) is close to the nominal external vector field Next = vf(ext(vext)). Note that as θ moves towards E(t), the pendulum pivot moves away from the origin. Only when θ gets close to E(t) does the pivot start to move toward the origin. The labels a, b, c and d label various points of θe, while the labels 1, 2, 3, and 4 label various points of the pendulum state. The angles θ1 and θe become approximately equal at b, staying close as θe goes from negative to positive at c, and as ex heads into the origin.

Figure 6.13: Regulation of the inverted pendulum. The internal equilibrium manifold E(t) is outlined in bold gray in the lower graph. The actual (x1, x2, θ1) trajectory of the pendulum is indicated in black, and its projection (x1, x2, θe) onto E(t) is shown in gray.


6.10 Simulations

In this section we compare simulation results for a linear quadratic regulator and for the internal equilibrium controller we have presented. We apply both controllers to the problems of output regulation and tracking for the inverted pendulum on a cart, where the output is the cart position. We will demonstrate that our method is more effective than linear quadratic regulation.

The linear quadratic regulator (see [CD91a] for review) is of the form

u = −K [ x1 − yd(t), x2 − ẏd(t), θ1, θ2 ]ᵀ = −K [ e_x^1, e_x^2, θ1, θ2 ]ᵀ   (6.140)

where K is chosen to minimize

∫₀^∞ ( [x1, x2, θ1, θ2] diag(10, 1, 1, 1) [x1, x2, θ1, θ2]ᵀ + u² ) dt   (6.141)

subject to the constraint that

d/dt [ x1 ]   [ 0 1 0 0 ] [ x1 ]   [  0 ]
     [ x2 ]   [ 0 0 0 0 ] [ x2 ]   [  1 ]
     [ θ1 ] = [ 0 0 0 1 ] [ θ1 ] + [  0 ] u.   (6.142)
     [ θ2 ]   [ 0 0 g 0 ] [ θ2 ]   [ −1 ]

Equation (6.142) is the Jacobian linearization of (6.127) for yd ≡ 0. The resulting gain matrix was calculated in Matlab [Mat92] to four decimal places. It is

K = [ 3.1623, 4.7378, −43.0326, −13.7789 ].   (6.143)

The gain coefficients for the internal equilibrium controller were calculated to be such that the Jacobian linearization of Σ(ue) is identical to the Jacobian linearization of Σ(−K [x1, x2, θ1, θ2]ᵀ) at the origin. Thus ∂ue/∂(x, θ) = −K. This leads to

β1 = 2.1885,  β2 = 1.35943,  γ1 = 33.2326,  γ2 = 13.7789.   (6.144)
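An LQR gain for the linearized model can be computed with a numpy-only continuous-time algebraic Riccati solve via the Hamiltonian-eigenvector method; the weights Q = diag(10, 1, 1, 1) and R = 1 follow (6.141), while the numerical value g = 9.81 is an assumption of this sketch:

```python
import numpy as np

g = 9.81                                  # assumed gravitational constant
A = np.array([[0., 1., 0., 0.],
              [0., 0., 0., 0.],
              [0., 0., 0., 1.],
              [0., 0., g,  0.]])
B = np.array([[0.], [1.], [0.], [-1.]])
Q = np.diag([10., 1., 1., 1.])            # state weights from (6.141)
R = np.array([[1.]])                      # input weight

# Solve A'P + PA - P B R^-1 B' P + Q = 0 from the stable invariant
# subspace of the associated Hamiltonian matrix.
Rinv = np.linalg.inv(R)
H = np.block([[A, -B @ Rinv @ B.T],
              [-Q, -A.T]])
w, V = np.linalg.eig(H)
Vs = V[:, w.real < 0]                     # basis of the stable subspace
P = np.real(Vs[4:, :] @ np.linalg.inv(Vs[:4, :]))
K = (Rinv @ B.T @ P).ravel()              # optimal gain, u = -K x

# sanity check: the closed loop A - B K must be Hurwitz
assert np.all(np.linalg.eigvals(A - B @ K[None, :]).real < 0)
```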

Table 6.1: Initial conditions for regulation simulations. An asterisk * indicates that the corresponding initial conditions are in the region of attraction of the origin for the particular controller.

x1(0)    x2(0)  θ1(0)  θ2(0)   LQR   IE
0        0      10°    0       *     *
0        0      20°    0       *     *
0        0      30°    0       *     *
0        0      40°    0       *     *
0        0      50°    0       *     *
0        0      60°    0             *
0        0      70°    0             *
0        0      80°    0             *
0        0      85°    0             *
1 [m]    0      0      0       *     *
2 [m]    0      0      0       *     *
4 [m]    0      0      0       *     *
8 [m]    0      0      0       *     *
16 [m]   0      0      0             *
32 [m]   0      0      0             *
64 [m]   0      0      0             *
128 [m]  0      0      0             *

6.10.1 Regulation Results

Two regulation tasks were used in order to demonstrate the enhancement of the region of attraction by the internal equilibrium regulator. In one task the initial conditions for the system were set so that x1(0), x2(0), and θ2(0) were zero, while the pendulum angle θ1(0) was incremented in increments of 10° from 10° to 80°, followed by a step of 5° to θ1(0) = 85°. In the second set x2(0), θ1(0), and θ2(0) were set to zero, while x1(0) was stepped through the sequence 1, 2, 4, 8, 16, 32, 64, and 128 meters. The chart in Table 6.1 shows a summary of the results. The linear quadratic regulator column is indicated with LQR and the internal equilibrium controller column is marked by IE. An asterisk * in the controller's column indicates that the initial conditions on the corresponding row were in the region of attraction of the origin for that controller.

Figures 6.14 through 6.18 show results for various initial values of θ1(0), from 10° through 85°, for the regulation problem (yd ≡ 0). Each of these figures is composed of four graphs. The top graph shows the output y versus time, with the LQR output dashed, and the internal equilibrium regulator output solid. The left graph of the second row shows the angle θ1 versus time, with the LQR θ1 dashed, the internal equilibrium

regulator θ1 solid, and θe(t) dotted. The right graph of the second row shows the input ue (solid) to the internal equilibrium controller, as well as the input u (dashed) to the LQR controller. The bottom graph, in (e_x^1, e_x^2, θ) space, shows the equilibrium manifold E(t) over the (e_x^1, e_x^2) error plane. Three trajectory lines converge to the origin. One trajectory is (e_x^1(t), e_x^2(t), θ(t)), which is shown in black. Another is (e_x^1(t), e_x^2(t), 0), the projection of (e_x^1(t), e_x^2(t), θ(t)) onto the error phase-plane, which is shown in light gray. A third is the projection of (e_x^1(t), e_x^2(t), θ(t)) onto E(t), namely (e_x^1(t), e_x^2(t), θe). This trajectory is shown in dark gray, as is the outline of E(t).

In Figure 6.14 the time course of the cart position is virtually identical for both the internal equilibrium controller and the LQR controller. The initial conditions are x1(0) = 0, x2(0) = 0, θ1(0) = 10°, θ2(0) = 0. The upper right graph shows convergence of θ1 (solid) to θe (dotted). The lower right plot in (e_x^1, e_x^2, θ) space shows how the trajectory (e_x^1(t), e_x^2(t), θ(t)) is attracted to E(t), moving first away from the origin; then, as the trajectory gets closer to E(t), (e_x^1(t), e_x^2(t), θ(t)) begins to close in on the origin. The internal equilibrium manifold E(t) appears flat at this view because the state trajectory is small compared to the curvature of E(t).

In Figure 6.15, with initial conditions x1(0) = 0, x2(0) = 0, θ1(0) = 20°, θ2(0) = 0, one can just begin to see a difference between the response of the IE controller and the LQR controller in the three time plots, as well as the curvature of E(t).

In Figure 6.16, with initial conditions x1(0) = 0, x2(0) = 0, θ1(0) = 50°, θ2(0) = 0, the responses of the two controlled systems have diverged substantially. Both, however, perform the regulation function.

In Figure 6.17, with initial conditions x1(0) = 0, x2(0) = 0, θ1(0) = 60°, θ2(0) = 0, the LQR controlled system has failed in that θ1 has exceeded 90° at approximately t = 1.5 seconds. The IE controller, however, continues to provide regulation as θ1 approximately tracks θe.

The last trial of this set, Figure 6.18, is for an initial pendulum angle of 85° and shows how the IE regulator pulls the trajectory (e_x^1(t), e_x^2(t), θ(t)) down to E(t). Approximate tracking of θe by θ1 is apparent, as is regulation to the origin.

Figure 6.14: Regulation trial for initial conditions x1(0) = 0, x2(0) = 0, θ1(0) = 10°, θ2(0) = 0. Since yd ≡ 0, (e_x^1, e_x^2) = (x1, x2).

Figure 6.15: Regulation trial for initial conditions x1(0) = 0, x2(0) = 0, θ1(0) = 20°, θ2(0) = 0.

Figure 6.16: Regulation trial for initial conditions x1(0) = 0, x2(0) = 0, θ1(0) = 50°, θ2(0) = 0.

Figure 6.17: Regulation trial for initial conditions x1(0) = 0, x2(0) = 0, θ1(0) = 60°, θ2(0) = 0.

Figure 6.18: Regulation trial for initial conditions x1(0) = 0, x2(0) = 0, θ1(0) = 85°, θ2(0) = 0.

The second set of regulation simulations, shown in Figures 6.19 through 6.22, corresponds to setting x2(0) = 0, θ1(0) = 0, and θ2(0) = 0, while stepping x1(0) through the sequence 1, 8, 16, and 64 (in meters). The arrangement of graphs is the same as for Figures 6.14 through 6.18.

Figure 6.19 corresponds to initial conditions x1(0) = 1, x2(0) = 0, θ1(0) = 0, θ2(0) = 0. Again, the responses of the IE and LQR controlled systems are virtually identical. The small scale of the trajectory in (e_x^1, e_x^2, θ) space again makes E(t) appear flat. Figure 6.20 corresponds to initial conditions x1(0) = 8, x2(0) = 0, θ1(0) = 0, θ2(0) = 0. The responses of the two controllers have diverged substantially, but both perform their regulation functions.

Figure 6.21 shows failure of the LQR regulator as the pendulum angle exceeds π/2 after approximately 0.5 seconds. The IE controller continues to regulate properly as θ1 is attracted to θe. Figure 6.22 shows successful regulation using the IE controller for initial conditions x1(0) = 64, x2(0) = 0, θ1(0) = 0, θ2(0) = 0. The curvature of E(t) can be clearly seen, as can its attractiveness to (e_x^1(t), e_x^2(t), θ(t)).

Figure 6.19: Regulation trial for initial conditions x1(0) = 1, x2(0) = 0, θ1(0) = 0, θ2(0) = 0.

Figure 6.20: Regulation trial for initial conditions x1(0) = 8, x2(0) = 0, θ1(0) = 0, θ2(0) = 0.

Figure 6.21: Regulation trial for initial conditions x1(0) = 16, x2(0) = 0, θ1(0) = 0, θ2(0) = 0.

Figure 6.22: Regulation trial for initial conditions x1(0) = 64, x2(0) = 0, θ1(0) = 0, θ2(0) = 0.

6.10.2 Tracking Results

The last group of graphs shows the results for tracking a sinusoidal trajectory. For each simulation the initial conditions were x(0) = 0, θ(0) = 0, and θ̂e(0) = 0. Figure 6.23 shows tracking results for both the internal equilibrium controller and the LQR controller tracking yd = sin(2π 0.1 t). Figure 6.24 shows tracking results for yd = sin(2π 0.2 t). Figure 6.25 shows tracking results for yd = sin(2π 0.5 t).

Each of Figures 6.23, 6.24, and 6.25 contains six graphs consisting of two sets of three graphs each. The set of three upper graphs corresponds to the internal equilibrium controller. The set of three lower graphs corresponds to the LQR controller. The top left graph of each set shows the desired output yd(t) (dashed), the actual (simulated) output y (solid), and the tracking error y − yd (dotted). The bottom left graph of each set shows the actual (simulated) pendulum angle θ1 (solid). For the IE set the internal equilibrium angle θe (dashed) and the error θ̂e − θe (dotted) are also shown. The right graph of each set shows the corresponding input u. For the graphs of Figure 6.25, the LQR controller caused the angle to exceed π/2 after a time of approximately 2.1 seconds, so the LQR set for that figure includes data only up until that time.

In all cases observed the performance of the IE controller exceeds that of the LQR controller with respect to the tracking error, the peak IE tracking error being approximately one quarter of the LQR tracking error. Figure 6.25 shows failure of the LQR controller to track, the pendulum angle exceeding π/2 at approximately 2.2 seconds. Though the tracking error for the IE controller has increased with respect to Figure 6.24, the pendulum angle remains confined to the interval (−π/2, π/2).

Figure 6.23: Tracking trial for initial conditions x(0) = 0, θ(0) = 0, with yd(t) = sin(0.2π t), a 0.1 Hz sinusoid.

Figure 6.24: Tracking trial for initial conditions x(0) = 0, θ(0) = 0, with yd(t) = sin(0.4π t), a 0.2 Hz sinusoid.

Figure 6.25: Tracking trial for initial conditions x(0) = 0, θ(0) = 0, with yd(t) = sin(π t), a 0.5 Hz sinusoid.

6.11 Discussion

Modifications of the Internal Equilibrium Controller. The internal equilibrium controller may be modified by setting the higher order Lie derivatives in ue (6.83) to zero. This results in a higher bound on the norm of the perturbation term p(yd^(0,n), e). Performance suffers as a result; holding the bound on yd^(0,n) the same as above, the invariant and attractive neighborhood of E(t) becomes larger, resulting in a larger ultimate bound on the tracking error. For systems in which the bound on yd^(0,n) is sufficiently small, this modification can pay off by reducing computational load.

Modifications of the Dynamic Inverter. There is latitude for choice in the construction of the dynamic inverter for approximating θe. As mentioned above in Remark 6.8.1, we could use L_Next θ̂e to estimate θ̇e rather than using (6.121). This is often a computationally simpler choice than (6.121) for the following reason: Recall that ue(ve) (6.83) is dependent on yd^(0,n)(t), x, and θ. Since vext is dependent on x^m, v̇ext is dependent on ue(ve) and hence on yd^(0,n)(t), x, and θ. Replacing v̇ext in (6.121) with L_Next vext gives L_Next θ̂e, which is dependent only on yd^(0,m)(t) and x.

Strict Bounds on Internal Trajectories. When strict bounds are required on ‖θ(t)‖, conditions must be such that the attractive and invariant neighborhood of E(t) remains within the strict bounds. It is quite possible for E(t) to obey the bounds while the attractive and invariant neighborhood of E(t) exceeds the bounds.

Extension to Time-Varying Systems. We have applied the internal equilibrium controller to time-invariant systems. This was for simplicity of exposition only. The internal equilibrium controller may be applied to time-varying systems in a similar manner, the only changes in the above arguments being smoothness requirements on f(t, x, θ) and g(t, x, θ).

Extension to Multi-Input, Multi-Output Systems. Again, for simplicity of exposition, we have confined ourselves to single-input, single-output systems. Extension to multi-input, multi-output systems is straightforward. The internal equilibrium becomes a vector-valued quantity. Conditions on the existence of E(t) in a neighborhood of the origin are inherited, as above, from the conditions of the implicit function theorem. Details of this extension will appear in future work.

6.12 Chapter Summary

We have introduced an internal equilibrium manifold E(t) and a controller that makes a neighborhood of E(t) attractive and invariant. This has afforded approximate output tracking of reference outputs from an open set, while maintaining internal stability. Comparison to the performance of a linear quadratic regulator in the case of the inverted pendulum and cart indicates a substantial increase in the region of attraction of the origin in both regulation and tracking tasks. We have shown that the ultimate bound on both the tracking error and the internal state is a class-K function of the bound on the output reference trajectory.

In the next chapter we will apply internal equilibrium control to tracking control with balance for a nonholonomic model of a bicycle.

Chapter 7

Automatic Control of a Bicycle

7.1 Introduction

In this chapter we derive a controller which, using steering and rear-wheel torque as controls, causes a model of an automatically controlled riderless bicycle to approximately track a time-parameterized path in the (horizontal) ground plane while retaining its balance. The design of the controller utilizes the results on internal equilibrium control from Chapter 6 in a multi-input, multi-output context on a mechanical system with nonholonomic constraints.

Control of the bicycle is a rich problem offering a number of considerable challenges of current research interest in the area of mechanics and robot control. As modeled here, the bicycle is an underactuated system, subject to nonholonomic contact constraints associated with the rolling constraints on the front and rear wheels. It is unstable (except under certain combinations of fork geometry and speed) when not controlled. It is also, when considered to traverse flat ground, a system subject to symmetries; its Lagrangian and constraints are invariant with respect to translations and rotations in the ground plane.

Though a number of researchers have studied the stability of bicycles and motorcycles under a linear model of rider control [NF72, Far92, Rol72, Sha71, Wei72] (see Hand [Han88] for a comprehensive survey), as far as we know the controller described here is the first controller allowing approximate tracking of arbitrary trajectories while maintaining balance. Von Wissel and Nikoukhah [vWN95] have constructed a method of piecing together optimal trajectories to construct paths, even around obstacles, along which a bicycle may be stabilized in traveling from one point to another. Control of balance and roll-angle tracking for the bicycle model used in this chapter has been addressed in [Get94].

This corresponds to internal tracking control (see Chapter 6, Section 6.5), which will be reviewed in Section 7.5 in the context of the bicycle problem. In addition to extending those results to tracking in the plane, we also utilize some new results from Bloch, et al. [BKMM94] on the derivation and structure of the equations of motion for nonholonomic systems with symmetries.

Another offering of this chapter is a simple, tractable model of the bicycle. Extant mathematical models of bicycles and motorcycles [Sha71, DN92, Rol72, FSR90, Far92, NDV92] have been designed with the intent of providing a complete physical description of the bicycle or motorcycle, capable of predicting a wide range of vehicle behavior, including some troublesome oscillatory instabilities [Far92] thought to arise from dynamic interactions of frame flexibility, tire-road interactions, and rider control. Instead, the bicycle model offered here, and in [Get94, GM95a], is designed for control of the bicycle in an operating range that might be reasonably realized by an autonomous vehicle. Thus we make a tradeoff of modeling errors, with respect to a more accurate model of a real automatic bicycle, for simplicity and tractability.

7.1.1 Chapter Overview

In Section 7.2 we describe the bicycle model, its generalized coordinates, and its nonholonomic constraints. In Section 7.3 we derive equations of motion for the bicycle. Through a sequence of coordinate changes and assumptions we convert the equations of motion to the external/internal convertible form (6.13) of Chapter 6. In Section 7.4 we construct an external tracking controller for the bicycle. In Section 7.5 we describe the internal tracking controller for the bicycle and show some interesting paths in the plane which result from its application. In Section 7.6 we define an internal equilibrium angle for the bicycle. In Section 7.7 we describe an internal equilibrium controller for the bicycle. Then in Section 7.8 we show the results of simulations of the internal equilibrium control of a bicycle following a variety of trajectories in the plane.

7.2 The Model

In this section we describe the bicycle model, its generalized coordinates, forces on the bicycle, and its nonholonomic constraints.

Figure 7.1: Side view of the bicycle model with σ = 0. The figure labels the force exerted on the bicycle by the ground, the steering axis, the contact-line, the constrained directions of wheel travel, and the geometric lengths b and c.

7.2.1 Assumptions on the Model

We consider the simplified bicycle model illustrated in Figure 7.1. The wheels of the bicycle are considered to have negligible inertial moments, mass, radii, and width, and to roll with neither lateral nor longitudinal slip. The rigid frame of the bicycle is assumed to be symmetric about a plane containing the rear wheel. The bicycle is assumed to have a steering axis fixed in the bicycle's plane of symmetry, and perpendicular to the flat ground when the bicycle is upright. For simplicity we neglect the moments of inertia of the bicycle, i.e., we assume a point-mass bicycle.

7.2.2 Reference Frames and Generalized Coordinates

Consider a ground-fixed inertial reference frame with x and y axes in the ground plane and z-axis perpendicular to the ground plane in the direction opposite to the force of gravity, as shown in Figure 7.2. This frame is called the ground frame. The intersection of the vehicle's plane of symmetry with the ground plane forms its contact-line. The contact-line is rotated about the z-direction by a yaw-angle, ψ. The contact-line is considered directed, with its positive direction from the rear to the front of the bicycle. The yaw-angle is zero when the contact-line is parallel to the x-axis. The angle that the bicycle's plane of symmetry makes with the vertical direction is the roll-angle, φ ∈ (−π/2, π/2).

Consider the line of intersection between the plane of the front wheel and the


Figure 7.2: Bicycle model rolled away from upright by angle φ. In this figure φ is negative.

ground plane. Let α ∈ (−π/2, π/2) be the steering-angle between this intersection and the contact-line, as shown in Figure 7.2. We will parameterize the steering angle by

σ := (1/b) tan(α),   (7.1)

and refer to σ as the steering variable. We will see in Section 7.2.4 that use of σ rather than α to parameterize steering will help to give the nonholonomic rolling constraints of the bicycle a very simple form.

Remark 7.2.1 Note that the steering angle α is not the angle of rotation of the steering shaft of the bicycle. Call the steering shaft angle ᾱ (see Figure 7.3). Then α and ᾱ are related by

tan(ᾱ) cos(φ) = tan(α)   (7.2)

and from (7.1) we have

tan(ᾱ) = σ b / cos(φ).   (7.3)

From (7.2), if φ = 0, then α = ᾱ. The two angles α and ᾱ are shown in Figure 7.3.
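The relations (7.1) through (7.3) can be sanity-checked numerically. In this sketch σ is the steering variable of (7.1); the variable names (and the wheelbase value b) are illustrative:

```python
import numpy as np

b = 1.0                                   # illustrative wheelbase

def sigma(alpha):
    """Steering variable (7.1): sigma = tan(alpha)/b."""
    return np.tan(alpha) / b

def shaft_angle(alpha, phi):
    """Invert (7.2), tan(alpha_bar) cos(phi) = tan(alpha), for alpha_bar."""
    return np.arctan(np.tan(alpha) / np.cos(phi))

alpha, phi = 0.3, -0.4
alpha_bar = shaft_angle(alpha, phi)
# (7.3): tan(alpha_bar) = sigma*b/cos(phi)
assert abs(np.tan(alpha_bar) - b*sigma(alpha)/np.cos(phi)) < 1e-12
# upright bicycle (phi = 0): shaft angle equals steering angle
assert abs(shaft_angle(alpha, 0.0) - alpha) < 1e-12
```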

204

Automatic Control of a Bicycle

Chap. 7

Figure 7.3: Leaning bicycle showing the relationship between the steering angle ψ and the steering shaft angle ψ̄. Note that in the figure, the roll-angle φ is negative.

The coordinates x, y, θ, φ, and σ are a complete set of generalized coordinates for the bicycle.

Alternative Generalized Velocities

Corresponding to the generalized coordinates is a set of generalized velocities ẋ, ẏ, θ̇, φ̇, and σ̇. When we introduce constraints in Subsection 7.2.4, it will be convenient to use an alternative set of generalized velocities which we now define. Let v_r be the component of the velocity of the rear-wheel contact along the contact-line as measured from the ground frame, and let v_⊥ be the component of the velocity of the rear-wheel contact perpendicular to the contact-line and parallel to the ground plane as measured from the ground frame. Both velocities are indicated in Figure 7.4. Note that v_r, v_⊥, ẋ, and ẏ are related through a rotation by θ, as indicated in Figure 7.5,

\begin{bmatrix} \dot x \\ \dot y \end{bmatrix} = \begin{bmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{bmatrix} \begin{bmatrix} v_r \\ v_\perp \end{bmatrix} \qquad (7.4)


Figure 7.4: The bicycle model showing body velocities v_r and v_⊥. Note that the roll-angle φ in the figure is negative.

Figure 7.5: Top view of rear wheel showing the relationships among v_r, v_⊥, ẋ, and ẏ.

The body frame of the bicycle is taken to have its origin at the rear-wheel contact point, with axes in the directions of v_r and v_⊥ as shown in Figure 7.4, and with the third axis parallel to the z-axis of the ground frame. The velocity components v_r and v_⊥ are the velocity of the rear-wheel contact point relative to the ground frame expressed in the body frame, as indicated in Figures 7.4 and 7.5. A complete alternative set of generalized velocities for the bicycle is then v_r, v_⊥, θ̇, φ̇, and σ̇.


7.2.3 Inputs and Generalized Forces

Let τ_r be the reaction force that the ground exerts on the bicycle at the rear-wheel contact point (see Figure 7.1); this reaction force τ_r acts along the contact-line as indicated in Figure 7.1. Another torque generator is associated with the steering variable σ. The corresponding generalized torque is τ_σ. The bicycle is also subject to the force of gravity mg acting on the point mass m of the bicycle. Thus our bicycle is riderless and under automatic control, driven by two torque generators.

7.2.4 Constraints

As mentioned in Subsection 7.2.1 the front and rear wheels are assumed to roll with neither lateral nor longitudinal slip. For simplicity we will deal directly with the component of the contact velocity v_r along the contact-line (see Figure 7.2). Front and rear-wheel contacts are constrained to have velocities parallel to the lines of intersection of their respective wheel-planes and the ground-plane, but are free to turn about an axis through the wheel/ground contact and parallel to the z-axis. The large arrows at the front-wheel and rear-wheel contacts in Figures 7.1 through 7.4 indicate the positive directions of the wheel/ground contact velocities. The generalized velocities of the bicycle are partitioned as ṙ = [φ̇, v_r, σ̇]^T and ṡ = [θ̇, v_⊥]^T. In these velocity coordinates the nonholonomic constraints associated with the front and rear wheels, assumed to roll without slipping, are expressed simply by ṡ + A(r, s)ṙ = 0, or

Bicycle Constraints

\begin{bmatrix} \dot\theta \\ v_\perp \end{bmatrix} + \underbrace{\begin{bmatrix} 0 & -\sigma & 0 \\ 0 & 0 & 0 \end{bmatrix}}_{A(r,s)} \begin{bmatrix} \dot\phi \\ v_r \\ \dot\sigma \end{bmatrix} = 0 \qquad (7.5)

The first equation of (7.5) says that the yaw-rate θ̇ of the bicycle is the product of the steering variable σ := (1/b) tan(ψ) and the rear-wheel velocity v_r, i.e. θ̇ = σ v_r. This follows by inspection of Figure 7.6.


Figure 7.6: Velocity geometry for the constraints.

The second equation tells us that v_⊥, the component of the velocity of the rear-wheel contact point perpendicular to the plane of that wheel, is zero, i.e. there is no lateral slip of the rear wheel; hence v_⊥ = 0. The linear map represented by the matrix A(r, s) in (7.5) connects the base velocities ṙ = [φ̇, v_r, σ̇]^T to the fiber¹ velocities ṡ = [θ̇, v_⊥]^T. If we know ṙ for all t, we can integrate to get r; then from (7.5) we can reconstruct ṡ for all t.

Remark 7.2.2 Due to symmetries of the constraints with respect to translations and rotations in the plane, A(r, s) depends only on r.

Remark 7.2.3 Assuming no wheel slip,

\int_0^t v_r \, dt \qquad (7.6)

is the length of the path of the bicycle at time t, where the path length at time t = 0 is assumed to be zero. The integral

\int_0^t v_\perp \, dt \qquad (7.7)

is always zero since v_⊥ is always zero.
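The constraint structure (7.5)–(7.7) can be checked by direct integration of the contact-point kinematics. The following is a hedged sketch (the parameter values, step count, and forward-Euler scheme are arbitrary illustrative choices): with constant σ and v_r the rear contact traces a circle of radius 1/σ, while the path length accumulates as the integral (7.6).

```python
import math

def roll_out(sigma, v_r, T, n=60000):
    """Euler-integrate the contact-point kinematics implied by (7.5):
    theta_dot = sigma * v_r, x_dot = v_r cos(theta), y_dot = v_r sin(theta)."""
    dt = T / n
    x = y = theta = path_len = 0.0
    for _ in range(n):
        theta += sigma * v_r * dt
        x += v_r * math.cos(theta) * dt
        y += v_r * math.sin(theta) * dt
        path_len += v_r * dt          # accumulates the integral (7.6)
    return x, y, theta, path_len

# Constant steering: the rear contact traces a circle of radius 1/sigma
# centered at (0, 1/sigma), while the heading grows as sigma * v_r * t.
x, y, theta, L = roll_out(sigma=0.5, v_r=2.0, T=3.0)
```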

7.3 Equations of Motion

In this section the reduced equations of motion for the bicycle will be derived using the results of [BKMM94]. For an alternative but equivalent derivation with the same results, using results from [BRM92], see [Get94].

¹For a description of this base-fiber structure of nonholonomic systems with symmetry see [BKMM94].


We choose a body-frame for the bicycle centered at the rear-wheel ground contact, with one axis pointing forward along the line of intersection of the rear-wheel plane with the ground, another axis orthogonal to the first and in the ground plane, and an axis normal to the ground, pointing in the direction opposite to gravity (see Figure 7.4). The body frame is a natural frame in which to write the Lagrangian of the bicycle for a number of reasons. In particular the rolling constraints take on the very simple form seen in (7.5). Let s_φ := sin(φ) and c_φ := cos(φ). We will associate an unspecified, though presumably known, φ- and σ-dependent moment of inertia J(φ, σ) of the front wheel about the steering axis, where the φ and σ dependence of J(φ, σ) is inherited from the φ and σ dependence of the steering shaft angle (7.3). We will introduce an assumption in Subsection 7.3.1 below that the steering variable σ is directly controlled, i.e. an input, so the lack of commitment to a specific J(φ, σ) will not be a problem, and will allow us to consider a broader range of bicycle models than if we were to commit ourselves to a specific J(φ, σ). This assumption is easy to justify in practice using a steering motor capable of high torque. Temporarily retaining the structure associated with J(φ, σ) will, on the other hand, allow us to use a convenient formulation of the equations of motion [BKMM94]. Thus the kinetic energy associated with the steering axis will be (1/2)J(φ, σ)σ̇². The Lagrangian for the bicycle is constructed from the kinetic energies associated with the point mass and the steering axis, as well as the potential energy of the point mass. It is (see Figures 7.2 and 7.4)

Bicycle Lagrangian

L = \frac{1}{2}J(\phi,\sigma)\dot\sigma^2 + \frac{m}{2}\Big[(v_r + p\,s_\phi\dot\theta)^2 + (v_\perp + c\dot\theta - p\,c_\phi\dot\phi)^2 + (p\,s_\phi\dot\phi)^2\Big] - mgp\,c_\phi \qquad (7.8)

where m is the mass of the bicycle, considered for simplicity to be a point mass. Note that the constraints are not reflected in the structure of L, though we have allowed them to influence our choice of velocity coordinates. This is evidenced by the inclusion of v_⊥ in (7.8), which the constraints dictate should be identically zero. Incorporating the constraints (7.5) into the Lagrangian we obtain the constrained Lagrangian for the bicycle


Constrained Bicycle Lagrangian

L_c = \frac{1}{2}J(\phi,\sigma)\dot\sigma^2 + \frac{m}{2}\Big[(v_r + p\sigma s_\phi v_r)^2 + p^2 s_\phi^2\dot\phi^2 + (c\sigma v_r - p\,c_\phi\dot\phi)^2\Big] - gmp\,c_\phi \qquad (7.9)

obtained by substituting θ̇ = σ v_r and v_⊥ = 0 into the unconstrained Lagrangian (7.8). Of course the equations of motion for the constrained Lagrangian are not Lagrange's equations. The correct formulation of the equations of motion based upon the constrained Lagrangian is derived in [BKMM94] and shown to be equivalent to d'Alembert's equations for constrained systems. They are²

\frac{d}{dt}\frac{\partial L_c}{\partial \dot r^i} - \frac{\partial L_c}{\partial r^i} + A_i^k\frac{\partial L_c}{\partial s^k} = -\frac{\partial L}{\partial \dot s^l}\, C_{ij}^l\, \dot r^j \qquad (7.10)

where the C_{ij}^l denote the components of the curvature of the connection A(r, s),

C_{ij}^l = \frac{\partial A_j^l}{\partial r^i} - \frac{\partial A_i^l}{\partial r^j} + A_i^k\frac{\partial A_j^l}{\partial s^k} - A_j^k\frac{\partial A_i^l}{\partial s^k} \qquad (7.11)

When all generalized forces τ_i correspond to so-called base velocities ṙ^i, Equation (7.10) is modified to be

\frac{d}{dt}\frac{\partial L_c}{\partial \dot r^i} - \frac{\partial L_c}{\partial r^i} + A_i^k\frac{\partial L_c}{\partial s^k} = -\frac{\partial L}{\partial \dot s^l}\, C_{ij}^l\, \dot r^j + \tau_i \qquad (7.12)

The resulting equations of motion (7.12) are reduced in number with respect to the number of generalized coordinates. The reduced equations³ may be expressed as

M(r)\ddot r = K(r, \dot r) + Bu \qquad (7.13)

where r = [φ, v_r, σ]^T, M ∈ ℝ^{3×3}, K ∈ ℝ³, B ∈ ℝ^{3×2}, and u = [τ_r, τ_σ]^T. The components of M, K, and B are

M(r) = \begin{bmatrix} p^2 & -cp\sigma c_\phi & 0 \\ -cp\sigma c_\phi & 1 + (c^2 + p^2 s_\phi^2)\sigma^2 + 2p\sigma s_\phi & 0 \\ 0 & 0 & \frac{1}{m}J(\phi,\sigma) \end{bmatrix} \qquad (7.14)

²Here we use the summation convention where, for example, if s is of dimension m, then A_i^k \frac{\partial A_j^l}{\partial s^k} denotes \sum_{k=1}^m A_i^k \frac{\partial A_j^l}{\partial s^k}.

³The reduced equations of motion (7.13) do not include the structure of the momentum equations [BKMM94], a set of first-order differential equations governing the effects of conserved momenta on the motion of the bicycle. In the case of the bicycle, such effects are governed entirely by the inputs.

K(r, \dot r) = \begin{bmatrix} gp s_\phi + (1 + p\sigma s_\phi)\sigma p c_\phi v_r^2 + cp c_\phi \dot\sigma v_r + \frac{1}{2m}\frac{\partial J(\phi,\sigma)}{\partial\phi}\dot\sigma^2 \\ -(1 + p\sigma s_\phi)2p\sigma c_\phi \dot\phi v_r - c\sigma p s_\phi \dot\phi^2 - \big(c^2\sigma + p s_\phi(1 + p\sigma s_\phi)\big)\dot\sigma v_r \\ K_3(r, \dot r) \end{bmatrix} \qquad (7.15)

where the third component K₃, associated with the steering axis, is eliminated below by the assumption of Subsection 7.3.1 that the steering variable is directly controlled, and

B = \begin{bmatrix} 0 & 0 \\ 1/m & 0 \\ 0 & 1/m \end{bmatrix} \qquad (7.16)

7.3.1 Practical Simplifications

We will further reduce our model through practical considerations. First, we will assume that for the range of values of φ, σ, φ̇, and σ̇ in which we will be interested, the term (1/2m)(∂J(φ,σ)/∂φ)σ̇², from the first entry of K(r, ṙ), is negligible. As stated above, we assume that the steering variable σ is directly controlled, i.e. we assume that we have a steering actuator that will allow us to track any desired σ(t) trajectory exactly. For an automatically controlled bicycle this is a reasonable approximation as long as |σ̈| is sufficiently small. It will be convenient to refer to σ̇ as the input w_σ,

\dot\sigma = w_\sigma \qquad (7.17)

After the above simplifications, the equations of motion take on the simpler form

Reduced Bicycle Equations of Motion

\dot\sigma = w_\sigma, \qquad \bar M(\phi,\sigma)\begin{bmatrix}\ddot\phi \\ \dot v_r\end{bmatrix} = \bar K(\phi,\sigma,\dot\phi,v_r) + \bar B(\phi,\sigma,v_r)\begin{bmatrix} w_\sigma \\ \tau_r \end{bmatrix} \qquad (7.18)

where

\bar M(\phi,\sigma) = \begin{bmatrix} p^2 & -cp\sigma c_\phi \\ -cp\sigma c_\phi & 1 + (c^2 + p^2 s_\phi^2)\sigma^2 + 2p\sigma s_\phi \end{bmatrix} \qquad (7.19)

\bar K(\phi,\sigma,\dot\phi,v_r) = \begin{bmatrix} gp s_\phi + (1 + p\sigma s_\phi)\sigma p c_\phi v_r^2 \\ -(1 + p\sigma s_\phi)2p\sigma c_\phi \dot\phi v_r - c\sigma p s_\phi \dot\phi^2 \end{bmatrix} \qquad (7.20)

\bar B(\phi,\sigma,v_r) = \begin{bmatrix} cp c_\phi v_r & 0 \\ -\big(c^2\sigma + p s_\phi(1 + p\sigma s_\phi)\big)v_r & 1/m \end{bmatrix} \qquad (7.21)

with inputs w_σ and τ_r.

Remark 7.3.1 The first column of B̄(φ, σ, v_r) in (7.21) has v_r as a multiplicative factor, confirming the intuitive notion that if v_r = 0 then the steering input w_σ can have no effect on either φ or v_r. Thus, as experience predicts, the bicycle is not controllable when v_r = 0. Also, as v_r gets closer to zero, the steering input w_σ must get larger in order to maintain influence over φ and v_r. It is practical then to choose controls and initial conditions such that

v_r > v_{\min} > 0 \qquad (7.22)

The model is equally valid for v_r < 0, but we will only consider the v_r > 0 case.

7.3.2 Conversion to External/Internal Convertible Form

The equations (7.18) are not yet in the E/I convertible form we require for application of the results of Chapter 6. We will convert (7.18) to the E/I convertible form (6.13) through a sequence of coordinate changes on both state and input.

A. First we apply a state-dependent input transformation to τ_r which makes the dynamics of v_r linear with respect to the input. The second equation of the reduced equations (7.18) is

\bar M_{21}\ddot\phi + \bar M_{22}\dot v_r = \bar K_2 + \bar B_{21}w_\sigma + \bar B_{22}\tau_r \qquad (7.23)

Solve (7.23) for v̇_r to get

\dot v_r = \bar M_{22}^{-1}\big(-\bar M_{21}\ddot\phi + \bar K_2 + \bar B_{21}w_\sigma\big) + \bar M_{22}^{-1}\bar B_{22}\tau_r \qquad (7.24)

Now define a transformation from the input force τ_r to a new input variable w_r with

\tau_r = \bar B_{22}^{-1}\big(\bar M_{21}\ddot\phi - \bar K_2 - \bar B_{21}w_\sigma + \bar M_{22}w_r\big) \qquad (7.25)

This transformation (7.25) is r-, ṙ-, and w_σ-dependent. Substitute (7.25) into the equations of motion (7.18) to get

\dot\sigma = w_\sigma, \qquad \ddot\phi = \bar M_{11}^{-1}\big(\bar K_1 - \bar M_{12}w_r + \bar B_{11}w_\sigma\big), \qquad \dot v_r = w_r \qquad (7.26)

so that the inputs are now w_r and w_σ.


B. We consider the output of the bicycle to be the (x, y) location of the rear-wheel ground contact. We will connect outputs to inputs through differentiation, with respect to t, of the nonholonomic constraints. From Figure 7.5 we can see that ẋ and ẏ are related to v_r and θ by

\begin{bmatrix} \dot x \\ \dot y \end{bmatrix} = \begin{bmatrix} v_r\cos(\theta) \\ v_r\sin(\theta) \end{bmatrix} \qquad (7.27)

Differentiate (7.27) twice with respect to t to get (with s_θ := sin(θ) and c_θ := cos(θ))

\begin{bmatrix} x^{(3)} \\ y^{(3)} \end{bmatrix} = \begin{bmatrix} -2\dot v_r s_\theta\dot\theta - v_r c_\theta\dot\theta^2 \\ 2\dot v_r c_\theta\dot\theta - v_r s_\theta\dot\theta^2 \end{bmatrix} + \begin{bmatrix} c_\theta & -v_r s_\theta \\ s_\theta & v_r c_\theta \end{bmatrix}\begin{bmatrix} \ddot v_r \\ \ddot\theta \end{bmatrix} \qquad (7.28)

Define new inputs

u_r := \ddot v_r = \dot w_r, \qquad u_\sigma := \ddot\theta = \dot\sigma v_r + \sigma\dot v_r = w_\sigma v_r + \sigma w_r \qquad (7.29)

Note that u_σ and u_r are related to w_σ and w_r through integration and a state-dependent coordinate change. Assume that we want (x, y) to track (x_d(t), y_d(t)), where x_d(·) and y_d(·) are C⁴. (Later, in Section 7.6, we will see the reason for the order of the smoothness requirement on x_d(·) and y_d(·).)

Through (7.28) and the input transformations (7.29) above, the bicycle model takes the form

\begin{bmatrix} x^{(3)} \\ y^{(3)} \end{bmatrix} = \begin{bmatrix} -2\dot v_r s_\theta\dot\theta - v_r c_\theta\dot\theta^2 \\ 2\dot v_r c_\theta\dot\theta - v_r s_\theta\dot\theta^2 \end{bmatrix} + \begin{bmatrix} c_\theta & -v_r s_\theta \\ s_\theta & v_r c_\theta \end{bmatrix}\begin{bmatrix} u_r \\ u_\sigma \end{bmatrix}, \qquad \ddot\phi = \frac{g}{p}s_\phi + \frac{c_\phi}{p}\Big[(1 + p\sigma s_\phi)\sigma v_r^2 + c\,u_\sigma\Big] \qquad (7.30)

Note that u_σ is the only input directly entering the internal dynamics.

C. In one last step we will achieve E/I convertible form for the bicycle. Define new inputs u_x and u_y by

\begin{bmatrix} u_r \\ u_\sigma \end{bmatrix} = \begin{bmatrix} c_\theta & s_\theta \\ -s_\theta/v_r & c_\theta/v_r \end{bmatrix}\left(\begin{bmatrix} u_x \\ u_y \end{bmatrix} - \begin{bmatrix} -2\dot v_r s_\theta\dot\theta - v_r c_\theta\dot\theta^2 \\ 2\dot v_r c_\theta\dot\theta - v_r s_\theta\dot\theta^2 \end{bmatrix}\right) \qquad (7.31)

This redefinition of inputs converts (7.30) to a multi-input, multi-output version of the external/internal convertible form (6.13) for internal equilibrium control, where we group the σ and v_r dynamics with the external dynamics.


External/Internal Convertible Form for the Bicycle

ext: \quad x^{(3)} = u_x, \qquad y^{(3)} = u_y

int: \quad \ddot\phi = \frac{g}{p}s_\phi + \frac{c_\phi}{p}\left[(1 + p\sigma s_\phi)\sigma v_r^2 + \frac{c}{v_r}\begin{bmatrix} -s_\theta & c_\theta \end{bmatrix}\left(\begin{bmatrix} u_x \\ u_y \end{bmatrix} - \begin{bmatrix} -2\dot v_r s_\theta\dot\theta - v_r c_\theta\dot\theta^2 \\ 2\dot v_r c_\theta\dot\theta - v_r s_\theta\dot\theta^2 \end{bmatrix}\right)\right] \qquad (7.32)

Note that θ and v_r, and thus σ, σ̇, and v̇_r, are functions of x^{(0,3)} and y^{(0,3)} through the relations

\dot x = v_r\cos(\theta), \qquad \dot y = v_r\sin(\theta), \qquad v_r = \sqrt{\dot x^2 + \dot y^2} \qquad (7.33)

Thus (7.32) is equivalent to the combination of the reduced equations of motion (7.19) and the constraints (7.5).

Remark 7.3.2 A count of equations reveals that (7.32) consists of eight differential equations (we count z^{(k)} = w as k differential equations) as compared to the six differential equations of (7.26) plus the two constraints (7.5). The extra two equations are due to the input transformation (7.29).
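The relations (7.33), together with θ̇ = σ v_r, can be inverted numerically to recover (θ, v_r, σ) from the output derivatives. A minimal sketch (the function name and the circular-motion test data are illustrative assumptions):

```python
import math

def state_from_outputs(xd1, yd1, xd2, yd2):
    """Recover (theta, v_r, sigma) from output derivatives using (7.33)
    together with theta_dot = sigma * v_r."""
    v_r = math.hypot(xd1, yd1)
    theta = math.atan2(yd1, xd1)
    theta_dot = (xd1 * yd2 - yd1 * xd2) / v_r**2
    return theta, v_r, theta_dot / v_r

# Test data: uniform circular motion of radius R at speed v, for which the
# steering variable should come out as the path curvature 1/R.
R, v, t = 8.0, 5.0, 0.7
w = v / R
xd1, yd1 = -R * w * math.sin(w * t), R * w * math.cos(w * t)
xd2, yd2 = -R * w * w * math.cos(w * t), -R * w * w * math.sin(w * t)
theta, v_r, sigma = state_from_outputs(xd1, yd1, xd2, yd2)
```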

7.3.3 Internal Dynamics of the Bicycle

Because only u_σ appears in the internal dynamics of (7.30), it will be more convenient for our purposes to work with the model in the form (7.30) rather than to use the E/I convertible form, where both u_x and u_y appear in the internal dynamics. The internal dynamics from (7.30) are the φ-dynamics,

Internal Dynamics of the Bicycle

\ddot\phi = \frac{g}{p}s_\phi + \frac{c_\phi}{p}\Big[(1 + p\sigma s_\phi)\sigma v_r^2 + c\,u_\sigma\Big] \qquad (7.34)

214

Automatic Control of a Bicycle

Chap. 7

Zero-Dynamics of the Bicycle

Recall that the zero dynamics of a control system is the reduced system obtained by restricting the output and input to be identically zero. If we were to choose the reference output (x_d(t), y_d(t)) = (0, 0) as our zero output, we would have no chance of stabilizing the bicycle about this output since, as pointed out in Remark 7.3.1, the bicycle is not controllable when v_r = √(ẋ² + ẏ²) = 0. Through changes of coordinates, however, there is a good deal of choice in what is meant by zeroing the output. For instance, let (x_z(t), y_z(t)) be a C³ trajectory in ℝ², and redefine the output of the bicycle system to be

\begin{bmatrix} \tilde x \\ \tilde y \end{bmatrix} = \begin{bmatrix} x - x_z(t) \\ y - y_z(t) \end{bmatrix} \qquad (7.35)

We may then redefine the input to be

\begin{bmatrix} \tilde u_x \\ \tilde u_y \end{bmatrix} = \begin{bmatrix} u_x - x_z^{(3)}(t) \\ u_y - y_z^{(3)}(t) \end{bmatrix} \qquad (7.36)

In this manner, any sufficiently smooth reference trajectory (x_r(t), y_r(t)) may be made to be the zero output. However, the class of such reference trajectories is divided between those reference trajectories resulting in autonomous zero dynamics and those resulting in non-autonomous zero dynamics. For a zero output we will choose rectilinear motion along the x-axis at a constant speed of v_z > v_{rmin} > 0. This keeps the zero dynamics autonomous while retaining controllability. If (x(t), y(t)) ≡ (v_z t, 0), then u_x ≡ 0, u_y ≡ 0, v_r ≡ v_z, σ ≡ 0, and φ̇ ≡ 0. The resulting zero dynamics of the bicycle are

Zero-Dynamics of the Bicycle

\ddot\phi = \frac{g}{p}\sin(\phi) \qquad (7.37)

The zero dynamics of the bicycle are the equations of motion for a planar pendulum in a gravitational field, where φ = 0 corresponds to the upright position of the pendulum. These are the same zero dynamics we would have obtained if we had naively let the zero output be (x_z(t), y_z(t)) ≡ (0, 0). We would, in fact, obtain the same zero dynamics if we had chosen our zero output to coincide with any inertial frame traveling parallel to the ground. This is a result of the invariance of the Lagrangian and the constraints under translations and rotations in the ground plane.


7.4 External Tracking Controller

Assume we wish to track the output reference trajectory (x_d(t), y_d(t)) ∈ C⁴. Let α_i, i ≤ 3, be such that

s^3 + \alpha_3 s^2 + \alpha_2 s + \alpha_1 \qquad (7.38)

has roots with strictly negative real parts. Inputs u_x and u_y which, ignoring internal dynamics, will achieve this goal comprise an external tracking controller for the bicycle

External Tracking Controller for the Bicycle

\begin{bmatrix} u_x^{\rm ext} \\ u_y^{\rm ext} \end{bmatrix} = \begin{bmatrix} x_d^{(3)}(t) \\ y_d^{(3)}(t) \end{bmatrix} - \sum_{i=1}^{3}\alpha_i\begin{bmatrix} x^{(i-1)} - x_d^{(i-1)}(t) \\ y^{(i-1)} - y_d^{(i-1)}(t) \end{bmatrix} =: \begin{bmatrix} V_x \\ V_y \end{bmatrix} \qquad (7.39)

Substitute u_x^{ext} and u_y^{ext} into (7.31) to get a corresponding value of the input u_σ, which we refer to as u_σ^{ext}, which produces the same tracking result (x, y) → (x_d, y_d):

u_\sigma^{\rm ext} = \frac{1}{v_r}\begin{bmatrix} -s_\theta & c_\theta \end{bmatrix}\left(\begin{bmatrix} u_x^{\rm ext} \\ u_y^{\rm ext} \end{bmatrix} - \begin{bmatrix} -2\dot v_r s_\theta\dot\theta - v_r c_\theta\dot\theta^2 \\ 2\dot v_r c_\theta\dot\theta - v_r s_\theta\dot\theta^2 \end{bmatrix}\right) \qquad (7.40)

The nominal external dynamics of the bicycle (see (6.63)) are then

Nominal External Dynamics for the Bicycle

\begin{bmatrix} x^{(3)} \\ y^{(3)} \end{bmatrix} = \begin{bmatrix} x_d^{(3)}(t) \\ y_d^{(3)}(t) \end{bmatrix} - \sum_{i=1}^{3}\alpha_i\begin{bmatrix} x^{(i-1)} - x_d^{(i-1)}(t) \\ y^{(i-1)} - y_d^{(i-1)}(t) \end{bmatrix} =: N_{\rm ext}(x_d^{(0,3)}, y_d^{(0,3)}, x, y) \qquad (7.41)

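The structure of (7.39) can be illustrated on the error dynamics alone: with x^{(3)} = V_x the tracking error e = x − x_d obeys e^{(3)} + α₃ë + α₂ė + α₁e = 0. A sketch with illustrative gains (the α values below place the roots of (7.38) at −1, −2, −3 and are assumptions, not the thesis gains):

```python
# Gains placing the roots of (7.38) at -1, -2, -3 (illustrative values):
# s^3 + a3 s^2 + a2 s + a1 = (s + 1)(s + 2)(s + 3).
a3, a2, a1 = 6.0, 11.0, 6.0

def v_ext(x, x1, x2, xd, xd1, xd2, xd3):
    # (7.39): V = x_d^(3) - sum_i alpha_i (x^(i-1) - x_d^(i-1))
    return xd3 - a1 * (x - xd) - a2 * (x1 - xd1) - a3 * (x2 - xd2)

# With x^(3) = V the tracking error e obeys e''' + a3 e'' + a2 e' + a1 e = 0,
# so it decays no matter the initial error.  Euler integration for 10 s:
e, e1, e2 = 1.0, 0.0, 0.0        # initial error and its derivatives
dt = 1e-4
for _ in range(100000):
    e3 = v_ext(e, e1, e2, 0.0, 0.0, 0.0, 0.0)   # error coordinates: x_d = 0
    e, e1, e2 = e + e1 * dt, e1 + e2 * dt, e2 + e3 * dt
```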
7.5 Internal Tracking Controller

Let φ_d(t), t ∈ ℝ₊, be a desired C² roll-angle trajectory, with φ_d(t) ∈ (−π/2, π/2) for all t ≥ 0. Let β_i, i ∈ {1, 2}, be such that the roots of the polynomial s² + β₂s + β₁ have


strictly negative real parts. An internal tracking controller for the bicycle is then

Internal Tracking Controller for the Bicycle

u_\sigma^{\rm int}(v_{\rm int}) = \frac{p}{c\,c_\phi}\Big[v_{\rm int} - \frac{g}{p}s_\phi - \frac{c_\phi}{p}(1 + p\sigma s_\phi)\sigma v_r^2\Big], \qquad v_{\rm int} = \ddot\phi_d - \beta_2(\dot\phi - \dot\phi_d) - \beta_1(\phi - \phi_d) \qquad (7.42)

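The internal tracking controller (7.42) is a feedback linearization of the roll dynamics (7.34). The following sketch holds σ and v_r frozen purely for illustration (they evolve in the full model), and the gains β_i are assumed values; it verifies that the closed roll loop settles on a constant φ_d:

```python
import math

g, p, c = 9.8, 1.0, 0.5          # physical parameters from Table 7.1
b1, b2 = 4.0, 4.0                # beta gains, s^2 + 4s + 4 = (s + 2)^2 (assumed)

def u_int(phi, phidot, sigma, v_r, phid, phid1, phid2):
    """Sketch of (7.42): pick u_sigma so that phi_ddot = v_int."""
    v = phid2 - b2 * (phidot - phid1) - b1 * (phi - phid)
    s, co = math.sin(phi), math.cos(phi)
    return (p / (c * co)) * (v - (g / p) * s
                             - (co / p) * (1 + p * sigma * s) * sigma * v_r**2)

sigma, v_r = 0.1, 5.0            # held frozen for this roll-only illustration
phi, phidot, phid = 0.3, 0.0, -0.1
dt = 1e-4
for _ in range(100000):          # 10 s of closed loop on (7.34)
    u = u_int(phi, phidot, sigma, v_r, phid, 0.0, 0.0)
    s, co = math.sin(phi), math.cos(phi)
    phidd = (g / p) * s + (co / p) * ((1 + p * sigma * s) * sigma * v_r**2 + c * u)
    phi, phidot = phi + phidot * dt, phidot + phidd * dt
```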
7.6 Internal Equilibrium Angle

We now define the internal equilibrium angle φ_e by the implicit relation

0 = \frac{g}{p}\sin(\phi_e) + \frac{\cos(\phi_e)}{p}\Big[(1 + p\sigma\sin(\phi_e))\sigma v_r^2 + c\,u_\sigma^{\rm ext}\Big] \qquad (7.43)

obtained by setting u_σ = u_σ^{ext} in the internal dynamics (7.34) and setting the left-hand side of (7.34) to zero, where u_σ^{ext} is given by (7.40). Divide (7.43) by cos(φ_e)/p to get the equivalent but simpler implicit equation for the internal equilibrium angle φ_e,

Internal Equilibrium Angle for the Bicycle

0 = g\tan(\phi_e) + (1 + p\sigma\sin(\phi_e))\sigma v_r^2 + c\,u_\sigma^{\rm ext} \qquad (7.44)

Note that φ_e is an implicit function of σ, v_r, and u_σ^{ext}. Equation (7.44) is a trigonometric polynomial in φ_e, having multiple solutions. We will be interested in only one solution, however: the upright solution (φ_e ∈ (−π/2, π/2)) closest to 0. We will determine an estimate φ̂_e of φ_e using dynamic inversion, and will single out the correct solution for φ_e by choice of initial condition for the dynamic inverter. The internal equilibrium manifold for the bicycle is

Sec. 7.6

Internal Equilibrium Angle

217

Internal Equilibrium Manifold for the Bicycle

E(t) = \big\{(x^{(0,2)}, y^{(0,2)}, \phi^{(0,1)}) \;\big|\; \phi = \phi_e(\sigma, v_r, u_\sigma^{\rm ext}),\; \dot\phi = 0\big\} \qquad (7.45)

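Since ∂F/∂φ_e = g sec²(φ_e) + pσ²v_r²cos(φ_e) > 0 on (−π/2, π/2) (where F denotes the residual of (7.44)), the upright solution is unique there and can be bracketed and bisected. A sketch using the physical parameters of Table 7.1 (the steady-turn values of σ, v_r, and u_σ^ext below are illustrative assumptions):

```python
import math

g, p, c = 9.8, 1.0, 0.5

def F(phi, sigma, v_r, u_ext):
    # Residual of (7.44).
    return (g * math.tan(phi)
            + (1 + p * sigma * math.sin(phi)) * sigma * v_r**2 + c * u_ext)

def phi_e(sigma, v_r, u_ext, lo=-1.5, hi=1.5):
    """Bisection for the unique root of (7.44) in (-pi/2, pi/2)."""
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if F(lo, sigma, v_r, u_ext) * F(mid, sigma, v_r, u_ext) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

# Steady turn with sigma = 1/8 [1/m] at v_r = 5 [m/s]: the equilibrium
# roll-angle leans the bicycle into the turn.
phi_star = phi_e(1.0 / 8.0, 5.0, 0.0)
```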
7.6.1 A Dynamic Inverter for the Internal Equilibrium Angle

We will use dynamic inversion to solve for φ_e in (7.44) given the state of the bicycle (x^{(0,2)}, y^{(0,2)}, φ^{(0,1)}). Let

F(\phi, \sigma, v_r, u_\sigma^{\rm ext}) := g\tan(\phi) + (1 + p\sigma\sin(\phi))\sigma v_r^2 + c\,u_\sigma^{\rm ext} \qquad (7.46)

To obtain an estimator for φ̇_e, differentiate F(φ_e, σ, v_r, u_σ^{ext}) with respect to t and solve for φ̇_e to get

\dot\phi_e = -\big[g\sec^2(\phi_e) + p\sigma^2 v_r^2\cos(\phi_e)\big]^{-1}\Big[(1 + p\sigma\sin(\phi_e))(\dot\sigma v_r^2 + 2\sigma v_r\dot v_r) + p\dot\sigma\sin(\phi_e)\sigma v_r^2 + c\,\dot u_\sigma^{\rm ext}\Big] \qquad (7.47)

Recall that if h(t, (x^{(0,2)}, y^{(0,2)})) is a smooth function of t and the state (x^{(0,2)}, y^{(0,2)}), then

\dot h = L_{N_{\rm ext}}h + \frac{\partial h}{\partial t} \qquad (7.48)

An estimator for φ̇_e is then obtained by replacing φ_e by φ̂_e in (7.47), and u̇_σ^{ext} by L_{N_{\rm ext}}u_σ^{ext}, where N_{ext} is the nominal external dynamics (7.41),

E(\hat\phi_e, \sigma, v_r, \dot v_r) = -\big[g\sec^2(\hat\phi_e) + p\sigma^2 v_r^2\cos(\hat\phi_e)\big]^{-1}\Big[(1 + p\sigma\sin(\hat\phi_e))(\dot\sigma v_r^2 + 2\sigma v_r\dot v_r) + p\dot\sigma\sin(\hat\phi_e)\sigma v_r^2 + c\,L_{N_{\rm ext}}u_\sigma^{\rm ext}\Big] \qquad (7.49)

where we remind the reader that θ, σ, v_r, and v̇_r are functions of x^{(0,2)} and y^{(0,2)}. A dynamic inverter for φ_e is then

\dot{\hat\phi}_e = -F(\hat\phi_e, \sigma, v_r, u_\sigma^{\rm ext}) + E(\hat\phi_e, \sigma, v_r, \dot v_r) \qquad (7.50)

where F(φ, σ, v_r, u_σ^{ext}) is given by (7.46).
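For frozen σ, v_r, and u_σ^ext the estimator term E vanishes and the inverter (7.50) reduces to the scalar flow φ̂̇_e = −F(φ̂_e, ·), which contracts onto the root because ∂F/∂φ > 0 on (−π/2, π/2). A sketch (Euler integration, the step size, and the parameter values are arbitrary illustrative choices):

```python
import math

g, p, c = 9.8, 1.0, 0.5

def F(phi, sigma, v_r, u_ext):
    # Residual of the equilibrium equation (7.44).
    return (g * math.tan(phi)
            + (1 + p * sigma * math.sin(phi)) * sigma * v_r**2 + c * u_ext)

# Frozen-parameter inverter: the flow phi_hat_dot = -F(phi_hat, .) alone
# drives phi_hat to the upright root, from the initial condition 0.
sigma, v_r, u_ext = 1.0 / 8.0, 5.0, 0.0
phi_hat, dt = 0.0, 1e-3
for _ in range(20000):           # 20 s of inverter time, Euler steps
    phi_hat -= F(phi_hat, sigma, v_r, u_ext) * dt
```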

Note that

L_{N_{\rm ext}}u_\sigma^{\rm ext} = -\frac{\dot v_r}{v_r^2}\begin{bmatrix} -s_\theta & c_\theta \end{bmatrix}(V - d) - \frac{\dot\theta}{v_r}\begin{bmatrix} c_\theta & s_\theta \end{bmatrix}(V - d) + \frac{1}{v_r}\begin{bmatrix} -s_\theta & c_\theta \end{bmatrix}\left(\begin{bmatrix} L_{N_{\rm ext}}V_x \\ L_{N_{\rm ext}}V_y \end{bmatrix} - \dot d\right) \qquad (7.51)

where V := [V_x, V_y]^T, d denotes the drift vector of (7.31), d := [−2v̇_r s_θ θ̇ − v_r c_θ θ̇², 2v̇_r c_θ θ̇ − v_r s_θ θ̇²]^T, ḋ is evaluated along N_{ext} using v̈_r = u_r^{ext} and θ̈ = u_σ^{ext}, and

\begin{bmatrix} L_{N_{\rm ext}}V_x \\ L_{N_{\rm ext}}V_y \end{bmatrix} = \begin{bmatrix} x_d^{(4)}(t) \\ y_d^{(4)}(t) \end{bmatrix} - \sum_{i=1}^{2}\alpha_i\begin{bmatrix} x^{(i)} - x_d^{(i)}(t) \\ y^{(i)} - y_d^{(i)}(t) \end{bmatrix} - \alpha_3\begin{bmatrix} V_x - x_d^{(3)}(t) \\ V_y - y_d^{(3)}(t) \end{bmatrix} \qquad (7.52)

The C⁴ smoothness required of x_d(t) and y_d(t) is due to the L_{N_{ext}}V_x and L_{N_{ext}}V_y terms, in which the fourth derivative of the reference trajectories appears.

7.7 Path Tracking with Balance

An internal equilibrium controller for the bicycle is obtained by letting

u_r = u_r^{\rm ext}, \qquad u_\sigma = u_\sigma^{\rm int}(v_e), \qquad v_e = L_{N_{\rm ext}}^2\hat\phi_e - \beta_2(\dot\phi - L_{N_{\rm ext}}\hat\phi_e) - \beta_1(\phi - \hat\phi_e) \qquad (7.53)

As shown in Chapter 6, the internal equilibrium controller causes a neighborhood of E(t) to become attractive and invariant, and thereby produces approximate tracking, i.e. after a time T, (x(t), y(t)) is close to (x_d(t), y_d(t)). Our final controller is

Internal Equilibrium Controller for the Bicycle

u_e(v_e) := u_\sigma^{\rm int}(v_e) = \frac{p}{c\,c_\phi}\Big[v_e - \frac{g}{p}s_\phi - \frac{c_\phi}{p}(1 + p\sigma s_\phi)\sigma v_r^2\Big], \qquad v_e := L_{N_{\rm ext}}^2\hat\phi_e - \beta_2(\dot\phi - L_{N_{\rm ext}}\hat\phi_e) - \beta_1(\phi - \hat\phi_e) \qquad (7.54)
7.8 Simulations

In this section we show the results of four simulations of the internal equilibrium controller on the bicycle model. The four simulations use four different reference trajectories: a straight line, a sinusoid, a circle, and a figure-eight, each revealing some capabilities and limitations of internal equilibrium control as applied to the bicycle. All simulations were performed in Matlab [Mat92] using an adaptive step-size Runge-Kutta integrator, ode45. The same physical and control parameters were used in all four simulations. These parameters are shown in Table 7.1.

Table 7.1: Physical and gain parameters for the simulations: c = 1/2 [m], p = 1 [m], g = 9.8 [m/s²], together with the fixed controller gains α_i and β_i.

7.8.1 Straight Path at Constant Speed

For the first simulation the output reference trajectory was along a straight line at constant speed,

(x_d(t), y_d(t)) = (5t, 0) \qquad (7.55)

where the units of length are meters. The initial conditions for the simulation are shown in Table 7.2.

Table 7.2: Initial conditions for a straight trajectory at constant speed.
x(0) = 0, y(0) = 5 [m], ẋ(0) = 2.5 [m/s], ẏ(0) = 0, ẍ(0) = 0, ÿ(0) = 0, φ(0) = 0, φ̇(0) = 0

The top graph of Figure 7.7 shows the resulting path in the plane (solid) along with the desired path (dotted). The top graph of Figure 7.8 shows the tracking error ‖(x, y) − (x_d, y_d)‖₂ versus t. The second graph of Figure 7.8 shows the steering angle ψ graphed versus t. The third graph shows the rear-wheel velocity v_r (solid) with the desired rear-wheel velocity v_{rd} = √(ẋ_d² + ẏ_d²) (dotted), both versus t. The bottom graph shows the roll-angle φ (solid) with the internal equilibrium roll-angle φ_e (dotted), both versus t.

Note the countersteering evident in the graph of ψ versus t. The steering angle goes positive first, steering the bicycle away from the desired path momentarily in order to cause the bicycle's roll-angle to converge towards the equilibrium roll-angle, which is, initially, positive. The tracking error for the straight path goes to zero. This behavior will also be seen for the case of the circular path below. This is a result of the internal equilibrium angle φ_e going to a constant value. Figure 7.9 shows schematically the convergence of φ to φ_e (shown as a dotted line) and the resulting path. The internal equilibrium controller steers the bicycle so that its


Figure 7.7: Target path (x_d, y_d) = (5t, 0) [m]. The x and y scales are in meters. The bicycle's path in the plane (solid) with the desired straight path (dotted).

roll-angle converges to a neighborhood of φ_e and approximately tracks φ_e as φ_e changes. Approximate tracking of φ_e causes approximate tracking of the desired rectilinear trajectory in the plane. Figure 7.9 corresponds to the top of Figure 6.13 in Chapter 6, pertaining to the internal equilibrium control of the inverted pendulum.


Figure 7.8: Target path (x_d, y_d) = (5t, 0) meters. The top graph shows the tracking error ‖(x, y) − (x_d, y_d)‖₂ versus t. The second graph shows the steering angle ψ. The third graph shows the rear-wheel velocity v_r (solid) with the desired rear-wheel velocity v_{rd} (dotted). The fourth graph shows the roll-angle φ (solid) with the internal equilibrium roll-angle φ_e (dotted).


Figure 7.9: Internal equilibrium control causes the bicycle to steer itself so that its roll-angle φ converges to a neighborhood of the equilibrium roll-angle φ_e, shown as a dashed line.


7.8.2 Sinusoidal Path

For the second simulation the reference trajectory was a sinusoid

(x_d(t), y_d(t)) = \big(5t, \sin(\tfrac{1}{5}t)\big) \qquad (7.56)

where the unit of length is meters. The initial conditions for this simulation are shown in Table 7.3. These are the same as the initial conditions for the straight path, as shown in Table 7.2.

Table 7.3: Initial conditions for the sinusoidal trajectory at constant speed.
x(0) = 0, y(0) = 5 [m], ẋ(0) = 2.5 [m/s], ẏ(0) = 0, ẍ(0) = 0, ÿ(0) = 0, φ(0) = 0, φ̇(0) = 0

Figure 7.10 shows the resulting path in the plane. Figure 7.11 shows the tracking error ‖(x, y) − (x_d, y_d)‖₂, as well as the steering angle ψ, the velocity v_r with the desired velocity, and the roll φ with φ_e.

Figure 7.10: Sinusoidal target path (x_d(t), y_d(t)) = (5t, sin((1/5)t)) [m]. The bicycle's path in the plane (solid) with the desired path (dotted).

Note that the tracking error becomes bounded, but non-zero, due to the presence of non-zero higher-order time derivatives, namely x_d^{(3,5)} and y_d^{(3,5)}, of the output reference trajectory.


Figure 7.11: Sinusoidal target path (x_d, y_d) = (5t, 2 sin(0.2t)). The top graph shows the tracking error ‖(x, y) − (x_d, y_d)‖₂. The bottom three graphs show the steering angle ψ, the rear-wheel velocity v_r (solid) with the desired rear-wheel velocity (dotted), and the roll-angle φ (solid) with the internal equilibrium roll-angle φ_e (dotted).


7.8.3 Circle at Constant Velocity

For the third simulation the reference trajectory was

(x_d(t), y_d(t)) = \big(8\sin(\tfrac{5}{8}t), 8\cos(\tfrac{5}{8}t)\big) \; [\mathrm{m}] \qquad (7.57)

a circle of radius 8 [m] traversed with a constant linear velocity of 5 [m/s]. The initial conditions for this simulation are shown in Table 7.4.

Table 7.4: Initial conditions for following a circular trajectory.
x(0) = 0, y(0) = 5 [m], ẋ(0) = 4.5 [m/s], ẏ(0) = 0, ẍ(0) = 0, ÿ(0) = 0, φ(0) = 0, φ̇(0) = 0

Figure 7.12 shows the resulting path in the plane. Figure 7.13 shows the tracking error ‖(x, y) − (x_d, y_d)‖₂, as well as the steering angle ψ, the velocity v_r with the desired velocity, and the roll φ with φ_e. We could have achieved a circular path by using the internal tracking controller

Figure 7.12: Circular target trajectory with radius 8 meters and tangential velocity of 5 meters per second. The first 10 seconds of the bicycle's path in the plane (solid) with the desired circular path (dotted).

to cause φ to converge to a constant nonzero angle, as was shown in [Get94]. However, in that case we would have no control over the location of the circle in the plane.


Figure 7.13: Circular target trajectory with radius 8 meters and tangential velocity of 5 meters per second. The top graph shows the tracking error ‖(x, y) − (x_d, y_d)‖₂. The bottom three graphs show the steering angle ψ, the rear-wheel velocity v_r (solid) with the desired rear-wheel velocity (dotted), and the roll-angle φ (solid) with the internal equilibrium roll-angle φ_e (dotted).


Note that, as in the case of the straight path, the tracking error goes to zero because the external tracking controller is regulating to constant values of v_r and θ̇.

7.8.4 Figure-Eight Trajectory

The fourth simulation used a figure-eight reference trajectory

(x_d(t), y_d(t)) = \big(20\sin(2\pi t/20), 10\sin(4\pi t/20)\big) \qquad (7.58)

where the units of length are meters. The initial conditions for this simulation are shown in Table 7.5.

Table 7.5: Initial conditions for following the figure-eight trajectory.
x(0) = 0, y(0) = 5 [m], ẋ(0) = 6 [m/s], ẏ(0) = 0, ẍ(0) = 0, ÿ(0) = 0, φ(0) = 0, φ̇(0) = 0

Figure 7.14 shows the resulting path in the plane (solid) along with the reference path (dotted). Figure 7.15 shows the tracking error ‖(x, y) − (x_d, y_d)‖₂ along with the steering angle ψ, the rear-wheel velocity v_r, and the roll-angle φ. Note that the tracking error becomes bounded but does not decay to zero, again due to the fact that the higher-order time derivatives of (x_d(t), y_d(t)) are time-varying. In this case, due to the presence of significant energy in the higher derivatives of (x_d, y_d), though the tracking error becomes bounded exponentially, it does not converge to zero as in the case of the circle and the straight line.


Figure 7.14: The bicycle's path in the plane (solid) with the desired figure-eight path (dotted).


Figure 7.15: Figure-eight target trajectory. The top graph shows the tracking error ‖(x, y) − (x_d, y_d)‖₂. The bottom three graphs show the steering angle ψ, the rear-wheel velocity v_r (solid) with the desired rear-wheel velocity (dotted), and the roll-angle φ (solid) with the internal equilibrium roll-angle φ_e (dotted).


7.9

Chapter Summary
Application of internal equilibrium control to the bicycle has resulted in a controller

that provides approximate tracking of smooth reference trajectories while retaining the bicycle's balance. The ultimate bound on the tracking error has been seen to coincide with the energy in the higher time derivatives of the output reference trajectories, as expected from the theory of Chapter 6. Dynamic inversion has been successfully used to track the internal equilibrium angle for the bicycle, with the result incorporated into the tracking controller. In the case of the bicycle, the internal equilibrium angle at time t is the solution of a trigonometric polynomial equation. Other means than dynamic inversion exist for solving trigonometric polynomials. For instance, the internal equilibrium angle equations could have been converted, through the tangent half-angle substitution, to a standard polynomial in the half-angle tangent s of the internal equilibrium angle. The resulting fifth-order polynomial could then be solved for its five roots using a standard polynomial solver, e.g. roots in Matlab [Mat92]. The real roots could then be converted back to angles through the arctangent function. Then the useful root, the one in (−π/2, π/2), could be picked out of the list of angles, providing a solution for the internal equilibrium angle at time t. We do not mean to imply that this solution, though somewhat arcane, is either slow or inefficient. Such an approach is used with great success in the area of inverse kinematics [Man92] for robotic manipulators having rotary joints. In contrast, however, the conceptual simplicity of dynamic inversion, combined with its ease of implementation, its continuous-time dynamic nature, and its accurate results, holds substantial appeal and convenience. Dynamic inversion may also be applied whether or not the implicit equation defining the internal equilibrium angle is a trigonometric polynomial.

The simulations have shown that countersteering emerges naturally as a result of the action of the internal equilibrium controller. Countersteering may be regarded as the result of tracking the equilibrium roll angle rather than trying to track the path in the plane directly.
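As an illustration of the half-angle approach described above, the sketch below solves a generic trigonometric equation a·sin(θ) + b·cos(θ) + c = 0, a stand-in for the bicycle's actual fifth-order trigonometric polynomial (whose coefficients are developed in Chapter 7, not reproduced here). The substitution s = tan(θ/2) turns it into an ordinary polynomial, a standard solver (numpy.roots, the analog of Matlab's roots) finds the roots, and the useful angle in (−π/2, π/2) is picked out:

```python
import numpy as np

def trig_root(a, b, c):
    """Solve a*sin(th) + b*cos(th) + c = 0 for th in (-pi/2, pi/2)
    via the tangent half-angle substitution s = tan(th/2)."""
    # sin = 2s/(1+s^2) and cos = (1-s^2)/(1+s^2) give the quadratic
    # (c - b) s^2 + 2 a s + (b + c) = 0.
    roots = np.roots([c - b, 2.0 * a, b + c])
    real_s = roots[np.abs(roots.imag) < 1e-9].real
    angles = 2.0 * np.arctan(real_s)               # back to angles
    good = [th for th in angles if -np.pi / 2 < th < np.pi / 2]
    return good[0]

th = trig_root(1.0, 0.5, 0.2)
print(th, np.sin(th) + 0.5 * np.cos(th) + 0.2)     # residual near zero
```

For the bicycle's fifth-order case the quadratic is simply replaced by the corresponding fifth-order coefficient list; the filtering and arctangent steps are unchanged.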


Chapter 8

Conclusions
In this dissertation we have presented a methodology for the construction of nonlinear dynamic systems that produce approximate solutions to inverse problems having time-varying vector-valued solutions. We have demonstrated the application of these results to a variety of problems in matrix analysis and nonlinear control, including matrix inversion, polar decomposition, implicit trajectory tracking, robot control, and the inversion of nonlinear control systems, particularly nonminimum-phase systems. In this final chapter we will briefly review the results of the preceding chapters. Then, after some general observations, we will make some suggestions for future work.
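As a concrete taste of the paradigm reviewed below, the following sketch integrates a dynamic inverter that tracks the inverse of a time-varying matrix A(t): the state X obeys an ordinary differential equation whose flow drives A(t)X − I to zero while following the motion of A(t)⁻¹. The gain mu, the test matrix, and the forward-Euler integration are illustrative choices only, not the dissertation's exact equations:

```python
import numpy as np

def A(t):                                        # a time-varying test matrix
    return np.array([[2.0 + np.sin(t), 0.3],
                     [0.1, 3.0 + np.cos(t)]])

def Adot(t, h=1e-6):                             # numerical time derivative
    return (A(t + h) - A(t - h)) / (2.0 * h)

mu, dt = 50.0, 1e-3
X = np.linalg.inv(A(0.0)) + 0.05                 # start near, not at, A(0)^{-1}
t = 0.0
for _ in range(5000):                            # integrate to t = 5
    E = A(t) @ X - np.eye(2)                     # inversion error
    # -X Adot X tracks d/dt A^{-1}; -mu X E contracts the error
    X = X + dt * (-X @ Adot(t) @ X - mu * X @ E)
    t += dt
print(np.linalg.norm(A(t) @ X - np.eye(2)))      # small residual
```

One Lyapunov-style check along the flow gives dE/dt ≈ −mu·E near the solution, which is the quadratic convergence argument of Chapter 2 in miniature.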

8.1 Review

In this section we review the results of each chapter of this dissertation.

Chapter 2. Dynamic Inversion of Nonlinear Maps. The dynamic inverse of a nonlinear map was defined. The definition of the dynamic inverse was such that when the dynamic inverse was composed with a nonlinear map, simple quadratic Lyapunov arguments were all that was necessary to prove convergence of the solution of the resulting dynamic inverter to the desired root. It was shown how individual dynamic inverses and derivative estimators could be combined to produce the solutions to coupled inverse problems using a single dynamic system. This resulted in a dynamic inverter that solved for a dynamic inverse, while simultaneously using the resulting dynamic inverse to solve the main inverse problem of interest. Examples showed how dynamic inversion could provide inversion of time-varying matrices and estimation of


the intersections of time-varying curves. Dynamic inversion was also used to construct a dynamic controller for a nonlinear control system.

Chapter 3. Dynamic Inversion and Polar Decomposition of Matrices. Building upon the use of dynamic inversion in inverting time-varying matrices, we extended its application to the asymptotic inversion of fixed matrices. Then, by constructing a time-parameterized homotopy from the identity to a fixed matrix, we showed how we could use our result on asymptotic inversion of time-varying matrices to invert a spectrally restricted matrix in finite time. In order to remove the spectral restrictions on the fixed matrix to be inverted, we constructed a dynamic inverter which asymptotically solved for the inverse of the positive-definite part of the polar decomposition of a time-varying matrix. Through additional matrix multiplications the unitary component of the polar decomposition, as well as the inverse of the decomposed matrix, were also produced. Then, by using another time-parameterized homotopy from the identity, we obtained the polar decomposition of any fixed matrix in finite time.

Chapter 4. Tracking Implicit Trajectories. For a partially feedback linearizable nonlinear control system, we showed how dynamic inversion could be used to allow exponentially convergent tracking of output reference trajectories that were implicitly defined. We obtained tracking by substituting into a conventional control law the solution of a dynamic inverter and estimates of the derivatives of that solution, in place of the actual solution and its derivatives. We showed that as long as the output reference trajectory and its derivatives were suitably bounded, and as long as the initial conditions for the dynamic inverter were sufficiently close to the actual reference trajectory and its derivatives, application of the implicit tracking controller preserved a bound on the internal dynamics of the controlled dynamic system.

Chapter 5. Joint-Space Tracking of Workspace Trajectories in Continuous Time. We showed how robotic manipulator control strategies may be divided into four classes, depending upon whether one wishes to stabilize trajectories in the workspace or the joint-space using workspace errors or joint-space errors. Then, by applying the implicit tracking controller of Chapter 4, we showed how we could achieve exponentially convergent tracking of workspace trajectories using joint-space errors with no need to call on discrete-time inverse-kinematic algorithms or matrix inversion.
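The flavor of implicit trajectory tracking can be conveyed by a scalar sketch: the reference is defined only implicitly by F(y, t) = y³ + y − sin t = 0, and a dynamic inverter produces it by integration, with the partial derivatives of F supplying a feedforward term so that F decays exponentially along the flow. The particular F, gain, and Euler scheme are illustrative choices, not the dissertation's exact construction:

```python
import numpy as np

mu, dt = 20.0, 1e-3
y, t = 0.5, 0.0                      # deliberately wrong initial guess
for _ in range(4000):                # integrate to t = 4
    F = y**3 + y - np.sin(t)         # implicit relation F(y, t) = 0
    Fy = 3.0 * y**2 + 1.0            # D1 F, never zero here
    Ft = -np.cos(t)                  # D2 F
    y += dt * (-(mu * F + Ft) / Fy)  # dynamic inverter
    t += dt
print(abs(y**3 + y - np.sin(t)))     # residual near zero
```

Along the flow, dF/dt = Fy·ydot + Ft = −mu·F, so the residual contracts at rate mu regardless of where the integration starts, provided D1F stays bounded away from zero.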


Chapter 6. Approximate Output Tracking for a Class of Nonminimum-Phase Systems. In this chapter we considered the problem of output tracking for a class of nonlinear nonminimum-phase systems called balance systems. Under suitable conditions, an internal equilibrium manifold, a submanifold of the system's state-space, could be constructed from the internal dynamics of the system. A controller was then constructed which made a neighborhood of the internal equilibrium manifold attractive and invariant. This resulted in approximate tracking of both the output reference trajectory and a bounded internal trajectory, the internal equilibrium angle. Application of the internal equilibrium controller to tracking control for the inverted pendulum on a cart, and comparison to a linear quadratic regulator, demonstrated the increase in performance of the internal equilibrium controller over the linear quadratic regulator.

Chapter 7. Automatic Control of a Bicycle. We converted the nonlinear nonholonomic model of a simple bicycle to internal/external convertible form. We showed how both roll angle and rear-wheel velocity could be easily controlled, ignoring the resulting path in the ground plane. Then we used the approximate output tracking control methodology of Chapter 6 to construct an internal equilibrium controller for approximate tracking of time-parameterized ground paths while retaining balance. Simulation results verified approximate stabilization of a number of time-parameterized ground paths, along with maintenance of vehicle balance.
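For orientation, the internal equilibrium angle admits a simple closed form in an idealized setting: for a point-mass pendulum balanced on a cart that must realize a commanded cart acceleration, the lean angle at which gravity supplies exactly that acceleration satisfies tan(theta_e) = ydd/g. The sketch below uses this simplified point-mass balance, which is an assumption for illustration and not the dissertation's full cart-pendulum or bicycle equations:

```python
import numpy as np

g = 9.81

def ydd(t):
    """Reference cart acceleration (an arbitrary smooth choice)."""
    return 0.5 * np.sin(t)

def theta_e(t):
    # Point-mass balance: the gravity component along the motion must
    # supply the commanded acceleration, so tan(theta_e) = ydd / g.
    return np.arctan(ydd(t) / g)

ts = np.linspace(0.0, 2.0 * np.pi, 5)
print([round(theta_e(t), 4) for t in ts])
```

For the actual balance systems of Chapter 6 the corresponding relation is implicit rather than closed-form, which is where dynamic inversion enters.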

8.2 Observations

8.2.1 Dynamic Time vs. Computational Time

Consider the use of discrete inversion routines in the context of a digital implementation of the control of a continuous-time dynamic system. Discrete inversion essentially introduces a computational time axis in addition to the dynamic time axis along which the continuous-time process flows. For instance, assume that we wish to control a plant with a compensator that is represented by a differential-algebraic equation, containing both an ordinary differential equation and a set of implicit algebraic relations. The time axis of both the plant and the differential equation part of the controller is the dynamic time axis. The differential equations must be integrated with a discrete integrator. Before integration can proceed, however, the set of implicit algebraic equations must be solved for, say, a root.


Using a discrete inverter, this computation of the root is sequential along a discrete computational time axis. Only when the discrete inverter has completed its last step can the integration proceed along the dynamic time axis. Now consider the same problem where dynamic inversion is used. A dynamic inverter for the solution of the implicit algebraic equations is an ordinary differential equation. It may be appended to the differential equations of the compensator and integrated along with the compensator. Thus the need for a computational time axis in addition to the dynamic time axis is removed. Furthermore, all responsibility for accuracy of the inversion is placed squarely in the lap of the integration routine. Accuracy of integration routines is a well-studied problem.
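The point can be made concrete with a toy differential-algebraic pair: a scalar differential equation driven by a variable z that must satisfy an algebraic relation g(x, z) = 0 at every instant. Appending a dynamic inverter for z lets a single integration loop advance both, with no inner root-finding iteration. The particular g, the gain mu, and the Euler scheme are illustrative choices:

```python
# State x is driven by z, where z must satisfy the algebraic
# relation g(x, z) = z**3 + z - x = 0 along the trajectory.
mu, dt = 30.0, 1e-3
x, z, t = 1.0, 0.0, 0.0
for _ in range(5000):                    # integrate to t = 5
    g = z**3 + z - x
    gz = 3.0 * z**2 + 1.0                # dg/dz, never zero here
    xdot = -x + z                        # differential part
    zdot = (-mu * g + xdot) / gz         # appended dynamic inverter
    x += dt * xdot                       # one integrator advances both
    z += dt * zdot
    t += dt
print(abs(z**3 + z - x))                 # constraint residual near zero
```

Here dg/dt = gz·zdot − xdot = −mu·g, so the constraint is enforced with exponential accuracy by the same integrator that advances the compensator; there is no separate computational time axis.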

8.2.2 Realization of Dynamic Inverters

Dynamic inversion is an analog computational paradigm. Consequently many different physical realizations of dynamic inversion are possible, e.g. analog electronic, mechanical, chemical, optical. In the current prevalent digital computational technology it is most often the case that differential equations such as the dynamic inversion systems (2.53), (2.80), and (2.108) are solved using numerical computation on digital computers. Consequently a comparison of the continuous estimator with discrete root-finding techniques such as Newton's method would, in the digital domain, be more appropriately made by pairing the continuous estimator with an algorithm for integration of ordinary differential equations. For example, one might compare a dynamic inverter integrated with a Runge-Kutta integrator of a particular order with the Newton-Raphson algorithm. Though such comparisons are beyond the scope of this dissertation, each such pairing of dynamic inverter and integration routine results in a discrete root estimator. Indeed, the direct comparison of a discrete estimator to a continuous estimator may be seen as unfair or inappropriate. The continuous dynamic approach of dynamic inversion, however, has the virtue that it is independent of implementation. It may be realized in an analog as well as a digital manner (through association with an integrator). Its combination with a continuous-time plant and controller in a control system allows a seamless incorporation of root-solving without the need to mix continuous-time and discrete-time analysis.
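Such a pairing can be sketched for a fixed root-finding problem: Newton-Raphson on one hand, and on the other a dynamic inverter (a Newton flow) integrated with a classical fourth-order Runge-Kutta step. Both land on the same root; the problem, gain, and step size are arbitrary choices made here for illustration:

```python
def f(th):  return th**3 - 2.0          # root is 2 ** (1/3)
def fp(th): return 3.0 * th**2

# Newton-Raphson: discrete computational time
nw = 1.0
for _ in range(30):
    nw -= f(nw) / fp(nw)

# Dynamic inverter theta' = -mu f / f', integrated with classical RK4
def rhs(th, mu=5.0):
    return -mu * f(th) / fp(th)

th, dt = 1.0, 0.01
for _ in range(1000):                   # integrate to t = 10
    k1 = rhs(th)
    k2 = rhs(th + 0.5 * dt * k1)
    k3 = rhs(th + 0.5 * dt * k2)
    k4 = rhs(th + dt * k3)
    th += dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

print(nw, th)                           # both near 2 ** (1/3)
```

The RK4-integrated inverter is itself a discrete root estimator, which is exactly the sense in which the comparison above becomes fair.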


8.3 Future Work

In this section we highlight a few directions in which the results of this dissertation may be extended in future work.

8.3.1 Methods for Producing Dynamic Inverses

For the most part we have relied upon linear dynamic inverses. For many important inverse problems no linear dynamic inverse exists and nonlinear dynamic inverses must be found. As yet we have no methodology for producing nonlinear dynamic inverses for a variety of problems. Even in the case that a linear dynamic inverse exists, the determination of an expression for the differential D1F(θ, t) may be impractical. Ways can be developed to determine a dynamic inverse dynamically by a method akin to finite differences, where one tracks points near the approximator for the inverse solution in order to construct a dynamic approximation of a linear dynamic inverse.
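A finite-difference surrogate for this idea can already be sketched in the square, nonsingular case: the inverter below never forms D1F analytically, approximating it (and D2F) by central differences at each instant. The particular F, gain, and step sizes are illustrative assumptions, and this is only a numerical stand-in for the dynamic construction proposed above:

```python
import numpy as np

def F(th, t):                        # implicit problem F(theta, t) = 0
    return np.array([th[0] + 0.1 * th[1] - np.sin(t),
                     th[1] + 0.1 * th[0]**3 - np.cos(t)])

def J_fd(th, t, h=1e-6):             # D1F by central differences
    n = th.size
    J = np.zeros((n, n))
    for i in range(n):
        e = np.zeros(n)
        e[i] = h
        J[:, i] = (F(th + e, t) - F(th - e, t)) / (2.0 * h)
    return J

def Ft_fd(th, t, h=1e-6):            # D2F by central differences
    return (F(th, t + h) - F(th, t - h)) / (2.0 * h)

mu, dt = 20.0, 1e-3
th, t = np.array([1.0, 0.0]), 0.0
for _ in range(4000):                # integrate to t = 4
    step = np.linalg.solve(J_fd(th, t), mu * F(th, t) + Ft_fd(th, t))
    th -= dt * step
    t += dt
print(np.linalg.norm(F(th, t)))      # residual near zero
```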

8.3.2 Differential-Algebraic Systems

Dynamic inversion allows one to incorporate the time-varying solutions of algebraic equations into dynamical systems. This has been shown in this dissertation in association with implicit trajectory tracking as well as internal equilibrium control. In the general context of differential-algebraic systems, which arise frequently in the domain of constrained mechanical systems, for instance, dynamic inversion may be able to provide a natural context in which to incorporate the solutions of algebraic constraints into dynamics. Constraint enforcement in the integration of differential-algebraic systems and conservative mechanical systems may also be fruitful areas of application.

8.3.3 Inverse Kinematics with Singularities

We have shown how dynamic inversion may be applied to the problem of inverting robot kinematics as long as a continuous isolated solution to the inverse kinematics problem exists. We also showed that dynamic inversion may be used for redundant manipulators, allowing us to arrive at a unique solution. We have avoided the problem of kinematic singularities, points in joint-space where the Jacobian of the forward-kinematics map drops rank. At these points, continuous isolated solutions meet. Dynamic inversion as stated in this dissertation cannot be used at such singularities. We may, however, be able to modify dynamic inversion, incorporating manipulator dynamics, so that in the context of workspace


tracking, the inverse path through such singularities becomes unique based upon the state of the manipulator.
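Away from singularities, the continuous-time inverse-kinematics construction reduces to a joint-space ordinary differential equation. The sketch below applies it to a planar two-link arm whose workspace reference keeps the elbow away from the singular configurations; the link lengths, gain, and reference circle are illustrative choices:

```python
import numpy as np

l1, l2 = 1.0, 1.0

def fk(q):                            # forward kinematics
    return np.array([l1 * np.cos(q[0]) + l2 * np.cos(q[0] + q[1]),
                     l1 * np.sin(q[0]) + l2 * np.sin(q[0] + q[1])])

def jac(q):                           # singular only at q[1] = 0 or pi
    s1, c1 = np.sin(q[0]), np.cos(q[0])
    s12, c12 = np.sin(q[0] + q[1]), np.cos(q[0] + q[1])
    return np.array([[-l1 * s1 - l2 * s12, -l2 * s12],
                     [ l1 * c1 + l2 * c12,  l2 * c12]])

def xd(t):                            # workspace reference: small circle
    return np.array([1.2 + 0.2 * np.cos(t), 0.5 + 0.2 * np.sin(t)])

def xd_dot(t):
    return np.array([-0.2 * np.sin(t), 0.2 * np.cos(t)])

mu, dt = 20.0, 1e-3
q, t = np.array([0.5, 1.0]), 0.0      # away from the straight-arm singularity
for _ in range(6000):                 # integrate to t = 6
    e = fk(q) - xd(t)
    q += dt * np.linalg.solve(jac(q), -mu * e + xd_dot(t))
    t += dt
print(np.linalg.norm(fk(q) - xd(t)))  # workspace error near zero
```

The open question raised above is precisely what replaces np.linalg.solve when the Jacobian loses rank along the path.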

8.3.4 Tracking Multiple Solutions

Dynamic inversion may be used for tracking multiple solutions of an inverse problem. One need only set up a dynamic inverter for each solution. When solutions cross, however, dynamic inversion breaks down. There are many applications where there is a physically sensible way of choosing the continuation of solutions. This sensibility needs to be incorporated into the inversion process.

8.3.5 Tracking Optimal Solutions

Most applications of dynamic inversion in this dissertation have been to solutions of equations for which the number of unknowns is equal to the number of equations. When the number of equations is less than the number of unknowns, other criteria must be used to produce a unique solution. Typically one chooses a cost function on the solution space and uses minimization of that cost function as the additional criterion for uniqueness, arriving at an optimal solution. In future work dynamic inversion will be extended to the tracking of optimal solutions.
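For linear underdetermined equations the standard cost is the squared norm of the solution, and the right inverse of Appendix A already selects that minimizer. A brief sketch (the matrix and right-hand side are arbitrary choices):

```python
import numpy as np

# Underdetermined system: A x = b has a 2-parameter family of solutions;
# the right inverse A^R = A^T (A A^T)^{-1} picks the minimum-norm one.
A = np.array([[1.0, 2.0, 0.5]])          # 1 equation, 3 unknowns
b = np.array([3.0])

AR = A.T @ np.linalg.inv(A @ A.T)        # right inverse (full row rank)
x = AR @ b

print(A @ x, np.linalg.norm(x))
```

The extension contemplated above is the nonlinear, time-varying analog of this selection, performed continuously by a dynamic inverter rather than by a pseudoinverse formula.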

8.3.6 Control System Design

Though it is common to implement control systems through either analog or digital circuits and computers, the study of dynamic inversion suggests that this limited view of controllers, flexible though it may be, is not the only view possible. As an analog computational paradigm, dynamic inversion may be realized by a variety of physical processes. In particular, dynamic inversion may be designed into a process or mechanism in order to achieve control without the aid of an electrical digital or analog circuit. This may be particularly useful in the control of very fast processes that happen on time scales too short for circuit realizations of controllers. Such processes occur in, for instance, fast chemical reactions, such as combustion, and in the area of inertial confinement fusion. At the same time, dynamic inversion may provide a way of viewing physical processes as computation. This may lead to new types of computers and new insights into the design of materials and structures.


Bibliography
[AG80] E. Allgower and K. Georg. Simplicial and continuation methods for approximating fixed points and solutions to systems of equations. SIAM Review, 22(1), January 1980.

[AM90] B.D.O. Anderson and J.B. Moore. Optimal Control: Linear Quadratic Methods. Prentice Hall, Englewood Cliffs, New Jersey, 1990.

[AMR88] R. Abraham, J.E. Marsden, and T. Ratiu. Manifolds, Tensor Analysis, and Applications, volume 75 of Applied Mathematical Sciences. Springer-Verlag, New York, second edition, 1988.

[BH69] A.E. Bryson and Y.-C. Ho. Applied Optimal Control: Optimization, Estimation, and Control. Blaisdell, Waltham, Mass., 1969.

[BKMM94] A.M. Bloch, P.S. Krishnaprasad, J.E. Marsden, and R.M. Murray. Nonholonomic mechanical systems with symmetry. Technical Report CDS 94-013, California Institute of Technology, Pasadena, 1994. Accepted for Archive for Rational Mechanics and Analysis.

[BL93] M.D. Di Benedetto and P. Lucibello. Inversion of nonlinear time-varying systems. IEEE Transactions on Automatic Control, 38(8):1259–1264, August 1993.

[Blo85] A.M. Bloch. Estimation, principal components, and Hamiltonian systems. Systems and Control Letters, 6, 1985.

[Blo90] A.M. Bloch. Steepest descent, linear programming, and Hamiltonian flows. Contemporary Mathematics, 114, 1990.

[BRM92] A.M. Bloch, M. Reyhanoglu, and N.H. McClamroch. Control and stabilization of nonholonomic dynamic systems. IEEE Transactions on Automatic Control, 37(11):1746–1757, November 1992.

[Bro89] R. W. Brockett. Least squares matching problems. Linear Algebra and Its Applications, 122:761–777, Sep–Nov 1989.

[Bro91] R. W. Brockett. Dynamical systems that sort lists, diagonalize matrices, and solve linear programming problems. Linear Algebra and Its Applications, 146:79–91, February 1991.

[BS72] R.H. Bartels and G. H. Stewart. Algorithm 432: Solution of the matrix equation AX + XB = C. Communications of the ACM, 15, 1972.

[CD91a] F.M. Callier and C.A. Desoer. Linear System Theory. Springer Texts in Electrical Engineering. Springer-Verlag, New York, 1991.

[CD91b] M.T. Chu and K.R. Driessel. Constructing symmetric nonnegative matrices with prescribed eigenvalues by differential equations. SIAM Journal on Mathematical Analysis, 22(5), September 1991.

[CDK87] L.O. Chua, C.A. Desoer, and E.S. Kuh. Linear and Nonlinear Circuits. McGraw-Hill, 1987.

[Chu92] M.T. Chu. Matrix differential equations: a continuous realization process for linear algebra problems. Nonlinear Analysis: Theory, Methods and Applications, 18(12), June 1992.

[Chu95] M.T. Chu. Scaled Toda-like flows. Linear Algebra and its Applications, 215, 15 January 1995.

[CR93] L.O. Chua and T. Roska. The CNN paradigm. IEEE Transactions on Circuits and Systems I: Fundamental Theory and Applications, 40(3), March 1993.

[Cra89] J.J. Craig. Introduction to Robotics: Mechanics and Control. Addison Wesley, New York, second edition, 1989.

[DJW84] J. P. Dunyak, J. L. Junkins, and L. T. Watson. Robust nonlinear least squares estimation using the Chow-Yorke homotopy method. Journal of Guidance, 7:752–755, 1984.

[DN92] F. Delebecque and R. Nikoukhah. A mixed symbolic-numeric software environment and its application to control systems engineering. Recent Advances in Computer-Aided Control Systems Engineering, 1992.

[DP94]

S. Devasia and B. Paden. Exact output tracking for nonlinear time-varying systems. In Proceedings of the 33rd IEEE Conference on Decision and Control, volume 3, pages 2346–2355, Lake Buena Vista, December 1994. IEEE.

[DPC94]

S. Devasia, B. Paden, and Degang Chen. Nonlinear inversion-based regulation. In American Control Conference, Baltimore, July 1994. IEEE.

[Far92]

F.A. Farouhar. Robust Stabilization of High Speed Oscillations in Single Track Vehicles by Feedback Control of Gyroscopic Moments of Crankshaft and Engine Inertia. PhD thesis, University of California, May 1992.

[FPEN86]

G.F. Franklin, J. D. Powell, and A. Emami-Naeini. Feedback Control of Dynamic Systems. Addison Wesley, Menlo Park, 1986.

[Fra77]

B.A. Francis. The linear multivariable regulator problem. SIAM Journal of Control and Optimization, 15, 1977.

[FSR90]

G. Franke, W. Suhr, and F. Riess. An advanced model of bicycle dynamics. European Journal of Physics, 11(2):116–121, March 1990.

[GBLL94]

J.W. Grizzle, M.D. Di Benedetto, and F. Lamnabhi-Lagarrigue. Necessary conditions for asymptotic tracking in nonlinear systems. IEEE Transactions on Automatic Control, 39(9):1782–1794, September 1994.

[Get94]

N. H. Getz. Control of balance for a nonlinear nonholonomic non-minimum phase model of a bicycle. In American Control Conference, Baltimore, June 1994. American Automatic Control Council.

[Get95]

N. H. Getz. Internal equilibrium control of a bicycle. In 34th IEEE Conference on Decision and Control, New Orleans, 13-15 December 1995. IEEE.

[GH90]

J. Guckenheimer and P. Holmes. Nonlinear Oscillations, Dynamical Systems, and Bifurcations of Vector Fields, volume 42 of Applied mathematical sciences. Springer-Verlag, New York, 1990.

[GH93] S. Gopalswamy and J. K. Hedrick. Tracking nonlinear non-minimum phase systems using sliding control. International Journal of Control, 57(5):1141–1158, 1993.

[GH95]

N. H. Getz and J. K. Hedrick. An internal equilibrium manifold method of tracking for nonminimum phase systems. In American Control Conference, Seattle, 21-23 June 1995. American Automatic Control Council.

[GM95a]

N. H. Getz and J. E. Marsden. Control for an autonomous bicycle. In IEEE International Conference on Robotics and Automation, Nagoya, Aichi, Japan, 21-27 May 1995. IEEE.

[GM95b]

N. H. Getz and J. E. Marsden. Dynamical methods for polar decomposition and inversion of matrices. Technical Report 624, Center for Pure and Applied Mathematics, Berkeley, California, 5 January 1995. Submitted to Linear Algebra and its Applications.

[GM95c]

N. H. Getz and J. E. Marsden. Tracking implicit trajectories. In IFAC Symposium on Nonlinear Control Systems Design, Tahoe City, 25-28 June 1995. International Federation of Automatic Control.

[GMW81]

P.E. Gill, W. Murray, and M.H. Wright. Practical Optimization. Academic Press, 1981.

[GNL79]

G.H. Golub, S. Nash, and C. Van Loan. A Hessenberg-Schur method for the problem AX + XB = C. IEEE Transactions on Automatic Control, AC-24, 1979.

[GS93]

R. Gurumoorthy and S.R. Sanders. Controlling nonminimum phase nonlinear systems: the inverted pendulum on a cart example. In Proceedings of the 1993 American Control Conference, volume 1, pages 680–685, San Francisco, June 1993. ACC.

[Han88]

R. S. Hand. Comparisons and stability analysis of linearized equations of motion for a basic bicycle model. Master's thesis, Cornell University, 1988.

[Har82]

P. Hartman. Ordinary Differential Equations. Birkhäuser, Boston, second edition, 1982.

[HJ85]

R. A. Horn and C. R. Johnson. Matrix Analysis. Cambridge University Press, New York, 1985.

[HJ91] R. A. Horn and C. R. Johnson. Topics in Matrix Analysis. Cambridge University Press, New York, 1991.

[HM94]

U. Helmke and J. B. Moore. Optimization and Dynamical Systems. Communications and Control Engineering. Springer-Verlag, New York, 1994.

[HMP94]

U. Helmke, J. B. Moore, and J. E. Perkins. Dynamical systems that compute balanced realizations and the singular value decomposition. SIAM Journal of Matrix Analysis and Its Applications, 15:733–754, July 1994.

[HMS94]

L.R. Hunt, G. Meyer, and R. Su. Computing particular solutions. In Proceedings of the 33rd IEEE Conference on Decision and Control, volume 3, pages 2520–2521, Lake Buena Vista, December 1994. IEEE.

[HR92a]

J. Huang and W. J. Rugh. An approximation method for the nonlinear servomechanism problem. IEEE Transactions on Automatic Control, 37(9):1395–1398, September 1992.

[HR92b]

J. Huang and W. J. Rugh. Stabilization on zero-error manifolds and the nonlinear servomechanism problem. IEEE Transactions on Automatic Control, 37(7):1009–1013, July 1992.

[HSM92]

J. Hauser, S. Sastry, and G. Meyer. Nonlinear control design for slightly non-minimum phase systems: application to V/STOL aircraft. Automatica, 28(4):665–679, July 1992.

[IB90]

A. Isidori and C. I. Byrnes. Output regulation of nonlinear systems. IEEE Transactions on Automatic Control, 35(2):131–140, 1990.

[IM91]

A. Isidori and C. H. Moog. On the nonlinear equivalent of the notion of transmission zeros. In C. I. Byrnes and A. Kursonski, editors, Modelling and Adaptive Control. Springer-Verlag, 1991.

[Isi89]

A. Isidori. Nonlinear Control Systems, An Introduction. Springer-Verlag, New York, second edition, 1989.

[JLS88]

J.-S. Jang, S.-Y. Lee, and S.-Y. Shin. An optimization network for matrix inversion. In D. Z. Anderson, editor, Neural Information Processing Systems, pages 397–401. American Institute of Physics, New York, 1988.

[Kad94] R.R. Kadiyala. Sys View: a visualization tool for viewing the regions of validity and attraction of nonlinear systems. In IEEE/IFAC Joint Symposium on Computer-Aided Control System Design, Tucson, 7-9 March 1994. IEEE/IFAC.

[Kai80] T. Kailath. Linear Systems. Prentice-Hall, 1980.

[Kha92] H.K. Khalil. Nonlinear Systems. Macmillan, New York, 1992.

[KHO86] P. Kokotovic, H.K. Khalil, and J. O'Reilly. Singular Perturbation Methods in Control: Analysis and Design. Academic Press, London, 1986.

[Kre92]

A.J. Krener. The construction of optimal linear and nonlinear regulators. Systems Models and Feedback: Theory and Applications, 1992.

[Man92]

D. Manocha. Algebraic and numeric techniques for modeling and robotics. PhD thesis, University of California at Berkeley, Berkeley, California, 1992.

[Mat92] Matlab. The MathWorks, Inc., Natick, Mass., 1992.

[Mea89] C. Mead. Analog VLSI and Neural Systems. Addison-Wesley, Reading, Mass., 1989.

[MH83] J.E. Marsden and T.J. Hughes. Mathematical Foundations of Elasticity. Prentice-Hall Civil Engineering and Engineering Mechanics Series. Prentice-Hall, Englewood Cliffs, N.J., 1983.

[MLS94]

R. M. Murray, Z. Li, and S. S. Sastry. A Mathematical Introduction to Robotic Manipulation. CRC Press, 1994.

[Nak91]

Y. Nakamura. Advanced Robotics: Redundancy and Optimization. Addison-Wesley Series in Electrical and Computer Engineering: Control Engineering. Addison-Wesley, Reading, Mass., 1991.

[NDV92]

R. Nikoukhah, F. Delebecque, and D. Van Wissel. Software for simulation and control of multibody systems. In Workshop on Dynamics and Control of Multibody Systems. Army Research Office, April 1992.

[NF72]

Ju. I. Neimark and N. A. Fufaev. Dynamics of Nonholonomic Systems, volume 33 of Translations of Mathematical Monographs. American Mathematical Society, Providence, Rhode Island, 1972.

[NTV91a] S. Nicosia, A. Tornambè, and P. Valigi. A solution to the generalized problem of nonlinear map inversion. Systems and Control Letters, 17(5), 1991.

[NTV91b]

S. Nicosia, A. Tornambè, and P. Valigi. Use of observers for the inversion of nonlinear maps. Systems and Control Letters, 16(6), 1991.

[NTV94]

S. Nicosia, A. Tornambè, and P. Valigi. Nonlinear map inversion via state observers. Circuits, Systems, and Signal Processing, 13(5), 1994.

[PC89]

T. S. Parker and L. O. Chua. Practical numerical algorithms for chaotic systems. Springer-Verlag, Berlin, New York, 1989.

[Rol72]

R. D. Roland, Jr. Bicycle dynamics, tire characteristics, and rider modelling. Technical report, Cornell Aeronautical Laboratory, Inc., Buffalo, 1972.

[Rut54] H. Rutishauser. Ein infinitesimales Analogon zum Quotienten-Differenzen-Algorithmus. Archiv der Mathematik, 1954.

[Rut58] H. Rutishauser. Solution of eigenvalue problems with the LR-transformation. National Bureau of Standards Applied Mathematics Series, 49, 1958.

[SB89] S. S. Sastry and M. Bodson. Adaptive Control: Stability, Convergence, and Robustness. Prentice Hall, Englewood Cliffs, New Jersey, 1989.

[Sha71] R. S. Sharp, Jr. The stability and control of motorcycles. Journal of Mechanical Engineering Science, 13(5), 1971.

[Smi91] S. T. Smith. Dynamical systems that perform the singular value decomposition. Systems and Control Letters, 16(5), 1991.

[Spo96] M.W. Spong. The control of underactuated mechanical systems. In First International Conference on Mechatronics, 26-29 January 1996.

[SV89] M. Spong and M. Vidyasagar. Robot Dynamics and Control. Wiley, 1989.

[TD93] K. Tchon and I. Duleba. On inverting singular kinematics and geodesic trajectory generation for robotic manipulators. Journal of Intelligent and Robotic Systems, 8:325–359, 1993.

[Tor90a] A. Tornambè. An asymptotic observer for solving the inverse kinematic problem. In Proceedings of the American Control Conference, San Diego, 1990.

[Tor90b] A. Tornambè. Use of high-gain observers in the inverse kinematic problem. Applied Mathematical Letters, 3(1), 1990.

[Tor91]

A. Tornambè. Asymptotic inverse dynamics of non-linear systems. International Journal of Systems Science, 22(12), 1991.

[vWN95]

D. von Wissel and R. Nikoukhah. Obstacle-avoiding trajectory optimization: Example of a riderless bicycle. 23(2), 1995.

[Wan93]

J. Wang. A recurrent neural network for real-time matrix inversion. Applied Mathematics and Computation, 55:89–100, 1993.

[Wat81]

L. T. Watson. Engineering applications of the Chow-Yorke algorithm. Applied Mathematical Computation, 9:111–133, 1981.

[WE84]

W.A. Wolovich and H. Elliot. A computational technique for inverse kinematics. In 23rd IEEE Conference on Decision and Control, Las Vegas, 12-14 December 1984. IEEE.

[Wei72]

D. H. Weir. Motorcycle Handling Dynamics and Rider Control and the Effect of Design Configuration on Response and Performance. PhD thesis, University of California at Los Angeles, 1972.


Appendix A

Notation and Terminology


We gather here some notation and definitions used throughout this dissertation.

Z+ : The nonnegative integers {0, 1, 2, ...}.

a := b : The expression a := b or b =: a defines a as being equivalent to the expression b.

R : The real numbers.

C : The complex numbers.

R^n : The real n-vectors.

R^{m×n} : The real m × n matrices.

E^n : The n-dimensional Euclidean space formed from R^n and the Euclidean metric ||x|| = (sum_{i=1}^{n} x_i^2)^{1/2} for x in R^n.

C°− : The open left-half of the complex plane, consisting of all s in C having strictly negative real part.

C°+ : The open right-half of the complex plane, consisting of all s in C having strictly positive real part.

k-bar : For any k in Z+, k ≥ 1, k-bar denotes the set of integers {1, 2, ..., k}.

r-bar : For a list of positive integers r, r-bar is the largest positive integer in r.

R+ : Define R+ := {t in R | t ≥ 0}, the set from which we draw our values of time, t.

∅ : The empty set.

sign(a) : For a in R, sign(a) = +1 if a > 0 and −1 if a < 0. We will consider sign(0) to be undefined.

y^(k)(t) : The kth derivative with respect to time, t, of a function y(t) will be denoted y^(k)(t), where k is in Z+, and y^(0)(t) = y(t).

y^(n1,n2)(t) : For y(t) in R^n, and n1, n2 in Z+ with n1 < n2, y^(n1,n2)(t) := [y^(n1), y^(n1+1), ..., y^(n2)](t) in R^{n×(n2−n1+1)}.
C_n^k[0, ∞) : The set of k times continuously differentiable functions on the interval [0, ∞).

C^k : Let i := (i1, ..., ik) denote any length-k combination of the k integers {1, ..., k}. A map F : X → Y whose partial derivatives ∂^k F(x)/(∂x_{i1} ∂x_{i2} ... ∂x_{ik}) are defined and continuous for all x in X and for all length-k combinations i is said to be C^k on X.

||x||_p : The p-norm ||x||_p of a vector x in R^n is defined by ||x||_p = (sum_{i=1}^{n} |x_i|^p)^{1/p}.

||φ(·)||_∞ : The norm ||φ(·)||_∞ of a vector-valued function φ : R+ → R^n; t ↦ φ(t) is defined as ||φ(·)||_∞ := sup_t (max_i |φ_i(t)|).

||(Θ, η)|| : For (Θ, η) in R^{n×n} × R^n, let

||(Θ, η)|| := ( sum_{i,j=1}^{n} |Θ_{ij}|^2 + sum_{i=1}^{n} |η_i|^2 )^{1/2}.    (A.1)

This is the l2 norm of the n × (n+1) matrix [Θ | η].

M^R, M^L : If M in R^{m×n}, m ≤ n, is full rank, then M^R := M^T (M M^T)^{−1} in R^{n×m} is the right inverse of M. Note that M M^R = I in R^{m×m}. If M in R^{m×n}, m ≥ n, is full rank, then M^L := (M^T M)^{−1} M^T in R^{n×m} is the left inverse of M, and M^L M = I in R^{n×n}.
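These identities are easy to check numerically; a brief sketch with an arbitrary full-rank test matrix (the matrix itself is an illustrative choice):

```python
import numpy as np

M = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 1.0]])          # 2 x 3, full row rank

MR = M.T @ np.linalg.inv(M @ M.T)        # right inverse: M @ MR = I
print(np.round(M @ MR, 6))

N = M.T                                  # 3 x 2, full column rank
NL = np.linalg.inv(N.T @ N) @ N.T        # left inverse: NL @ N = I
print(np.round(NL @ N, 6))
```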

B_r : For each dimension n we define the open ball B_r := {x in R^n : ||x|| < r}. The choice of a particular norm ||·|| will be apparent from context. In order to emphasize the dimension of B_r we will often specify the set having the same dimension as B_r for which B_r is a subset, e.g. B_r ⊂ R^n.

y(·) : For a function f : A → B, f(·) refers to the function f evaluated on its entire domain A, as opposed to f(x), which refers to the function f evaluated at a single x in A. Thus while f(x) is a value in B, f(·) is an element of the function space of all functions having domain A and codomain B.

||y(·)||_∞^(k) : For y : [0, ∞) → R^n, y in C^k, ||y(·)||_∞^(k) = sup_{t≥0} { ||y(t)||_∞, ||y^(1)(t)||_∞, ..., ||y^(k)(t)||_∞ }.

B_r^(k) : For y : [0, ∞) → R^n, y in C^k, and real number r > 0, B_r^(k) = {y | ||y||_∞^(k) < r}.

vf(Σ) : Given a dynamic system Σ : xdot = F(x, w, u) with input u and exogenous parameters w, we define vf(Σ) := F(x, w, u), so that vf(Σ) is the vector field associated with the dynamic system Σ.

s_θ, c_θ : For an angle θ, s_θ := sin(θ) and c_θ := cos(θ).

D_k F(a1, a2, ..., an) : For any map F(a1, a2, ..., an), D_k F(a1, a2, ..., an) is the partial derivative of F with respect to a_k. The lth derivative with respect to the kth argument is denoted D_k^l F(a1, a2, ..., an).

D_{j,k} F(a1, a2, ..., an) : The mixed partial derivative of F with respect to the jth and then the kth argument,

D_{j,k} F(a1, a2, ..., an) := ∂^2 F(a1, ..., aj, ..., ak, ..., an) / (∂a_j ∂a_k).

D_k^l F(a1, a2, ..., an) : The repeated partial derivative of F with respect to the kth argument, taken l times,

D_k^l F(a1, a2, ..., an) := ∂^l F(a1, ..., ak, ..., an) / (∂a_k ... ∂a_k)   (l times).

L_f φ(x) : For smooth vector fields f and g, and real-valued function φ : R^n → R, the Lie derivative of φ in the direction of f, denoted L_f φ, is defined as L_f φ(x) := dφ(x) · f(x). The expression L_g L_f φ(x) denotes L_g(L_f φ(x)), and the k-times repeated Lie derivative L_f L_f ... L_f φ(x) is denoted L_f^k φ. By convention L_f^0 φ(x) ≡ φ(x).
248 f (x, t) L

Notation and Terminology

Appendix A

For smooth t-dependent vector elds f and g , and real-valued function : f is dened as Rn [0, ) R, L f (x, t) := D1 (x, t) f (x, t) + D2 (x, t) L gL f (x, t) denotes L g (L f (x, t)), and the k-times repeated Lie derivaThe expression L f L f . . . L f (x, t) is denoted L k . By convention L 0 (x, t) (x, t). tive L
f f

Lipschitz

exists a non-negative real-valued piecewise continuous function g : R+ R+ such that for all x, y S , f (x) f (y ) g (t) x y . O(g (x)) If for some xed M R, 0 < M < , and for some positive real-valued
k i=1 fi (x)

A map f (x) is Lipschitz continuous or Lipschitz on an open set S Rn if there

we say that h(x) = O(g1 (x), . . ., gk (x)). T A

f (x) = O(g (x)) as x x0 . If h(x) =

function g : Rn R, it is true that limxx0 f (x) /|g (x)| M , then we say that where fi (x) = O(gi(x)), i k, then

The interval [, ] R with the points 0 and 2 identied is the torus denoted T.
2

The l2 norm on A = [Ai,j ]i,j n Rnn , is dened by A


2

:= (
i,j n

|Ai,j |2 )1/2 .

(A.2)

The l norm on A is dened by A GL(n, R) O(n, R) S(n, R) SE(3)

:= maxi,j n |Ai,j |.

The group of nonsingular n-by-n matrices, {M Rnn | det(M ) = 0}. The group of orthogonal n-by-n matrices, {M Rnn |M T M = I }. The vector space of symmetric n-by-n matrices, {M Rnn |M T = M }.

The special Euclidean group of transformations R3 R3 x Rx + p

with x and p in R3 , and where R Rnn satises det(R) = 0 and RRT = I . s(n) := n(n + 1)/2 X, (X ), x, (x) The dimension of S(n, R), i.e. s(n) := 1 2 n(n + 1).

Given any X S(n, R), and a particular ordered basis {1 , . . . , s(n)} of Rs(n) , with X S(n, R), assume that X = is(n) xi i with xi R. Then X (X ):= [x1 , . . . , xs(n) ]T . Also, given any x Rs(n) , x (x):=
i is(n) x i

S(n, R).


σ(M)  The spectrum of M ∈ GL(n), the set of eigenvalues of M, is denoted σ(M).

θ, θ̄  Consider a vector θ ∈ R^n, and an associated map F : R^n × R_+ → R^n; (θ, t) ↦ F(θ, t). We will associate with the symbol θ̄ the meaning of an exact solution of F(θ̄, t) = 0. When θ is the state of a dynamic system, it may be regarded as an approximator of θ̄. Otherwise, θ refers to the first argument of F(θ, t).

F̄(z, t), Ḡ(w, t)  Given maps F(θ, t) and G(w, θ, t) and a continuous isolated equilibrium θ̄(t) satisfying F(θ̄, t) = 0 for all t, we define F̄(z, t) := F(z + θ̄, t) and Ḡ(w, t) := G(w, z + θ̄, t).

class K  A continuous function α : [0, r) → R_+ is in class K if α(0) = 0 and α is strictly increasing. Note that the sum of two class K functions is also of class K, as is their product and their composition α_1(α_2(·)). Also, if α(·) is class K then so is its inverse α^{-1}(·).

class KL  A continuous function β : [0, a) × R_+ → R_+ belongs to class KL if for each fixed t_1 ∈ R_+, r ↦ β(r, t_1) is in class K, and for each fixed r_1 ∈ [0, a), t ↦ β(r_1, t) is decreasing with respect to t, with β(r_1, t) → 0 as t → ∞.

End marks: distinct marks denote the end of theorems, corollaries, lemmas, and propositions; the end of proofs of theorems, corollaries, lemmas, and propositions; and the end of remarks, examples, definitions, claims, assumptions, algorithms, and properties.
Appendix B

Some Useful Theorems

In this section we gather, for the reader's convenience, some useful theorems and techniques referred to in, but not original to, this dissertation.

B.1  A Comparison Theorem

The following theorem, which we will refer to as the comparison theorem, will prove useful.

Theorem B.1.1 (Comparison Theorem). Let f : R → R; x ↦ f(x) and g : R → R; x ↦ g(x) be Lipschitz in x. Let x(t) denote the solution to ẋ = f(x) with x(0) = x_0, and let y(t) denote the solution to ẏ = g(y) with y(0) = y_0 ≥ x_0. Assume that for all x ∈ R, f(x) ≤ g(x). Then for all t ∈ R_+, x(t) ≤ y(t).

Proof: See Hartman [Har82], Theorem 4.1, page 26.
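The comparison theorem is easy to exercise numerically. The following sketch (an illustration of ours, not from the dissertation) integrates ẋ = f(x) and ẏ = g(y) with f(x) = −x ≤ g(x) = −x + 1/2 for all x, and equal initial conditions, and checks that the ordering x(t) ≤ y(t) is never violated:

```python
import math

def rk4_step(vf, x, dt):
    """One classical Runge-Kutta step for the scalar ODE x' = vf(x)."""
    k1 = vf(x)
    k2 = vf(x + 0.5 * dt * k1)
    k3 = vf(x + 0.5 * dt * k2)
    k4 = vf(x + dt * k3)
    return x + (dt / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)

f = lambda x: -x          # f(x) <= g(x) for every x
g = lambda x: -x + 0.5

x, y, dt = 1.0, 1.0, 0.001   # y(0) >= x(0), as the theorem requires
for _ in range(5000):        # integrate out to t = 5
    x, y = rk4_step(f, x, dt), rk4_step(g, y, dt)
    assert x <= y + 1e-12    # ordering preserved along the way

# Closed forms: x(t) = e^{-t},  y(t) = 1/2 + (1/2) e^{-t}.
print(x, y)
```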

B.2  Taylor's Theorem

We will rely upon Taylor's theorem in some of our arguments. For convenience we include a version here from [GMW81].

Theorem B.2.1 (Taylor's Theorem). If f(x) ∈ C^r, then there exists a scalar θ, with 0 ≤ θ ≤ 1, such that

    f(x + h) = f(x) + h f′(x) + (1/2) h^2 f″(x) + · · · + (1/(r−1)!) h^{r−1} f^{(r−1)}(x) + (1/r!) h^r f^{(r)}(x + θh)

where f^{(r)}(x) denotes the r-th derivative of f evaluated at x.
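As a quick numerical check of the Lagrange remainder (our illustration, not part of the text), take f = exp, x = 0, h = 1/2, and r = 4: solving the remainder identity for θ should produce a value in [0, 1], since exp equals all of its own derivatives:

```python
import math

x, h, r = 0.0, 0.5, 4

# Degree r-1 Taylor polynomial of exp about x, evaluated at x + h.
taylor = sum(h ** k / math.factorial(k) for k in range(r))

# Lagrange form:  f(x+h) - taylor = (h^r / r!) * exp(x + theta*h).
remainder = math.exp(x + h) - taylor
theta = math.log(remainder * math.factorial(r) / h ** r) / h
print(theta)   # a valid theta in [0, 1], as the theorem guarantees
```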

B.3  Singularly Perturbed Systems

The following theorem, from Khalil [Kha92], provides sufficient conditions under which one may conclude exponential stability of a singularly perturbed system.

Theorem B.3.1. Consider the system

    ẋ = f(t, x, z, ε)
    ε ż = g(t, x, z, ε)    (B.1)

Assume that for all (t, x, ε) ∈ [0, ∞) × B_r × [0, ε_0],

i. f(t, 0, 0, ε) = 0 and g(t, 0, 0, ε) = 0.

ii. The equation 0 = g(t, x, z, 0) has an isolated solution z = h(t, x) such that h(t, 0) = 0.

iii. The functions f, g, and h and their partial derivatives up to order 2 are bounded for z − h(t, x) ∈ B_ρ.

iv. The origin of ẋ = f(t, x, h(t, x), 0) is exponentially stable.

v. The origin of

    dy/dτ = g(t, x, y + h(t, x), 0)

is exponentially stable uniformly in (t, x).

Then there exists ε* > 0 such that for all ε < ε*, the origin of (B.1) is exponentially stable.

Proof: See Khalil [Kha92], Theorem 8.3, page 467.
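A minimal simulation (ours, not from the dissertation) of a system satisfying the hypotheses of Theorem B.3.1 is ẋ = z, εż = −x − z. The quasi-steady-state solution is z = h(x) = −x, the reduced model ẋ = −x and the boundary layer dy/dτ = −y are both exponentially stable, so the theorem predicts exponential stability of the full system for small ε:

```python
# Fast/slow system:  x' = z,  eps*z' = -x - z.
eps = 0.05
x, z = 1.0, 0.0        # start off the slow manifold z = -x
dt, T = 1e-3, 10.0
for _ in range(int(T / dt)):
    # Forward Euler; both updates use the pre-step (x, z).
    x, z = x + dt * z, z + dt * (-x - z) / eps

print(x, z)            # both near zero: the origin attracts exponentially
```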


B.4  Tracking Convergence for Integrator Chains

A standard result of linear control theory, as elementary as it is useful, is the following:

Theorem B.4.1. Consider the control system

    ξ̇_i^j = ξ_i^{j+1},  j ≤ r_i − 1,  i ≤ p
    ξ̇_i^{r_i} = u_i
    y_i = ξ_i^1    (B.2)

Let y_d(t) ∈ C_p^r where r := max_{i ≤ p} {r_i}. Let α_i^j ∈ R be such that all of the roots of the polynomials

    s^{r_i} + Σ_{j=1}^{r_i} α_i^j s^{j−1},  i ≤ p    (B.3)

have strictly negative real parts. Then the control

    u_i = y_{d_i}^{(r_i)} − Σ_{k=1}^{r_i} α_i^k ( ξ_i^k − y_{d_i}^{(k−1)} ),  i ≤ p    (B.4)
causes y(t) to converge to y_d(t) exponentially.

Remark B.4.2. The utility of the controller (B.4) in the context of nonlinear control is greatly enhanced by feedback linearization of nonlinear systems (see Appendix C), in which a state-dependent coordinate transformation converts a nonlinear system to the form (B.2), after which (B.4) may be applied in order to render the nonlinear system stable.

Proof of Theorem B.4.1: Define the coordinate change

    e_i^j := ξ_i^j − y_{d_i}^{(j−1)},  i ≤ p,  j ≤ r_i    (B.5)

The coordinates e_i^j are referred to as error coordinates. In the error coordinates, the system (B.2) takes the form

    ė_i^j = e_i^{j+1},  j ≤ r_i − 1,  i ≤ p
    ė_i^{r_i} = u_i − y_{d_i}^{(r_i)}    (B.6)

and the input (B.4) takes the form

    u_i = y_{d_i}^{(r_i)} − Σ_{k=1}^{r_i} α_i^k e_i^k,  i ≤ p.    (B.7)
Combining (B.6) with (B.4) gives

    ė_i^j = e_i^{j+1},  j ≤ r_i − 1,  i ≤ p
    ė_i^{r_i} = −Σ_{k=1}^{r_i} α_i^k e_i^k,  i ≤ p    (B.8)

Equation (B.8) is referred to as the error dynamics of the tracking system. Let e_i := [e_i^1, . . . , e_i^{r_i}]^T, i ≤ p, and e := [e_1^T, . . . , e_p^T]^T. Then (B.8) is equivalent to

    ė = A e,  A := block-diag(A_1, A_2, . . . , A_p)    (B.9)

where

    A_i := [  0        1        0      · · ·    0
              0        0        1      · · ·    0
              ⋮                         ⋱       ⋮
              0        0        0      · · ·    1
             −α_i^1   −α_i^2   −α_i^3  · · ·  −α_i^{r_i} ]    (B.10)

The matrix A is in block companion form.¹ Its eigenvalues are, therefore, the union of the roots of the polynomials (B.3). By assumption all of those roots have strictly negative real parts. Therefore the origin of the error dynamics is exponentially stable. Exponential stability of the origin of the error dynamics implies that y(t) − y_d(t) goes to zero exponentially.
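Theorem B.4.1 can be exercised on the simplest nontrivial case, a single chain of length two (a double integrator). In the sketch below (ours, not from the dissertation) the gains come from s^2 + 5s + 6 = (s + 2)(s + 3), whose roots have strictly negative real parts as (B.3) requires, and the state converges to y_d(t) = sin t:

```python
import math

# Single chain of length 2 (p = 1, r_1 = 2).
# Polynomial (B.3): s^2 + a2*s + a1 = (s+2)(s+3)  =>  a1 = 6, a2 = 5.
a1, a2 = 6.0, 5.0
yd  = math.sin                      # desired output y_d(t)
yd1 = math.cos                      # y_d^(1)
yd2 = lambda t: -math.sin(t)        # y_d^(2)

xi1, xi2 = 1.0, 0.0                 # start deliberately off the trajectory
dt, T, t = 1e-4, 10.0, 0.0
for _ in range(int(T / dt)):
    # Control (B.4): u = y_d^(2) - a1*(xi1 - y_d) - a2*(xi2 - y_d^(1)).
    u = yd2(t) - a1 * (xi1 - yd(t)) - a2 * (xi2 - yd1(t))
    xi1, xi2 = xi1 + dt * xi2, xi2 + dt * u
    t += dt

print(abs(xi1 - math.sin(T)), abs(xi2 - math.cos(T)))  # both small
```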

¹ See Horn and Johnson [HJ85], page 147, for a discussion of companion matrices, their properties, and their relation to polynomials.

B.5  A Converse Theorem

The following theorem, from Khalil [Kha92] (Theorem 4.5, page 180), states that if a nonlinear system has an exponentially stable equilibrium, then it has a Lyapunov function for a neighborhood of that equilibrium.

Theorem B.5.1. For x ∈ B_r ⊂ R^n, let x = 0 be an equilibrium of

    ẋ = f(t, x)    (B.11)

where f : R_+ × B_r → R^n is C^1 in t and x, and ∂f/∂x is bounded on B_r uniformly in t. Let k, λ, and r_0 be positive constants with r_0 < r/k. For t_0 ≥ 0, assume that for each x(t_0) ∈ B_{r_0},

    ‖x(t)‖_2 ≤ k ‖x(t_0)‖_2 e^{−λ(t−t_0)},  t ≥ t_0,    (B.12)

i.e. the origin of (B.11) is exponentially stable uniformly in t. Then there is a function V : R_+ × B_r → R, and positive constants c_1, c_2, c_3, and c_4 such that

    c_1 ‖x‖_2^2 ≤ V(t, x) ≤ c_2 ‖x‖_2^2    (B.13)

    ∂V/∂t + (∂V/∂x) f(t, x) ≤ −c_3 ‖x‖_2^2    (B.14)

    ‖∂V/∂x‖_2 ≤ c_4 ‖x‖_2    (B.15)

If f is independent of t, then V can be chosen to be independent of t.

Proof: See Khalil [Kha92], Theorem 4.5, pages 180–183.

B.6  Uniform Ultimate Boundedness

The following definition is from [Kha92].

Definition B.6.1. The solutions of ẋ = f(t, x) are said to be uniformly ultimately bounded if there exist constants b > 0 and c > 0 such that for each a ∈ (0, c) and each t_0 ∈ R_+, there exists a T = T(a) > 0 such that

    ‖x(t_0)‖ < a  ⟹  ‖x(t)‖ < b,  ∀ t > t_0 + T    (B.16)

Regarding Definition B.6.1 we have the following theorem, also from Khalil [Kha92].

Theorem B.6.2 (Uniform Ultimate Boundedness). For x ∈ B_r ⊂ R^n, let f : R_+ × B_r → R^n be piecewise continuous in t and locally Lipschitz in x. Let V : R_+ × B_r → R be C^1 in x and t. Let α_i(·), i ≤ 3, be class K functions such that for each x ∈ B_r,

    α_1(‖x‖) ≤ V(t, x) ≤ α_2(‖x‖)    (B.17)

    ∂V/∂t + (∂V/∂x) f(t, x) ≤ −α_3(‖x‖),  ∀ ‖x‖ ≥ μ > 0    (B.18)

for all t ≥ 0, with μ < α_2^{-1}(α_1(r)). Then there exists a class KL function β(·, ·) and a finite time t_1 (dependent on x(t_0) and μ) such that for each ‖x_0‖ < α_2^{-1}(α_1(r)),

    ‖x(t)‖ ≤ β(‖x(t_0)‖, t − t_0),  t_0 ≤ t < t_1    (B.19)

    ‖x(t)‖ ≤ α_1^{-1}(α_2(μ)),  t ≥ t_1    (B.20)

Also, if α_i(r) = k_i r^c, for k_i > 0 and c > 0, then β(r, s) = k r exp(−γ s) with k = (k_2/k_1)^{1/c} and γ = k_3/(k_2 c).

Proof: See Khalil [Kha92], Theorem 4.10, page 202.
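A standard example (ours, not from the text): ẋ = −x + (1/2) sin t has no asymptotically stable equilibrium, but V = x^2/2 gives V̇ = −x^2 + (1/2) x sin t < 0 whenever |x| > 1/2, so its solutions are uniformly ultimately bounded with ultimate bound 1/2. A simulation confirms this:

```python
import math

x, dt, t = 5.0, 1e-3, 0.0    # start well outside the ultimate bound
ultimate = 0.0
for _ in range(int(20.0 / dt)):
    x += dt * (-x + 0.5 * math.sin(t))   # forward Euler
    t += dt
    if t > 5.0:                          # after the transient has died out
        ultimate = max(ultimate, abs(x))

# Steady response amplitude is 0.5/sqrt(2) ~ 0.354: bounded, not converging to 0.
print(ultimate)
```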

Appendix C

Partial Feedback Linearization of Nonlinear Control Systems
In this section, through state-dependent coordinate and input transformations, we will put a nonlinear time-invariant control system of the form

    ẋ = f(x) + g(x) u
    y = h(x)    (C.1)

into a form which displays a linear relationship between the input u ∈ R^p and the output y ∈ R^p. The state x is assumed to be in R^n. The column vector fields f(x) and g_i(x), i ≤ p, are assumed to be smooth on an open set of R^n, as are the functions h_i(x), i ≤ p. The standard linearization procedure from [Isi89] for autonomous systems will be used.

Let r := [r_1, . . . , r_p], with integers r_i ≥ 1. A system (C.1) has vector relative degree [Isi89] r at a point x_0 if

i. for all i, j satisfying 1 ≤ i ≤ p, 1 ≤ j ≤ p, for all k < r_i − 1, and for all x in a neighborhood of x_0,

    L_{g_j} L_f^k h_i(x) = 0    (C.2)

ii. the p × p decoupling matrix

    Γ(x) := [ L_{g_1} L_f^{r_1 − 1} h_1(x)   · · ·   L_{g_p} L_f^{r_1 − 1} h_1(x)
              L_{g_1} L_f^{r_2 − 1} h_2(x)   · · ·   L_{g_p} L_f^{r_2 − 1} h_2(x)
              ⋮                                       ⋮
              L_{g_1} L_f^{r_p − 1} h_p(x)   · · ·   L_{g_p} L_f^{r_p − 1} h_p(x) ]    (C.3)

is nonsingular for all x in a neighborhood of x_0.


Assume that (C.1) has well-defined vector relative degree r. This implies that if we successively differentiate y_i(t) with respect to t, then a component u_j(t) of the input u(t) appears for the first time at the r_i-th derivative of y_i. We use this insight as follows. Define the partial coordinate change

    ξ_i^k = ξ_i^k(x) := L_f^{k−1} h_i(x),  i ≤ p,  k ≤ r_i.    (C.4)

Let n_r := Σ_{i ≤ p} r_i. Let η_i(x) be any smooth functions of x, with η_i = η_i(x), i ∈ {1, . . . , n − n_r}, which complete the partial coordinate change (C.4), and such that (ξ, η) = Φ(x) := (ξ(x), η(x)) satisfies Φ(0) = 0 and |det(DΦ(x))| > 0. Let

    ξ := [ξ_1^1, ξ_1^2, . . . , ξ_1^{r_1}, ξ_2^1, . . . , ξ_2^{r_2}, . . . , ξ_p^{r_p}]^T    (C.5)

so that

    ξ = [y_1, ẏ_1, . . . , y_1^{(r_1 − 1)}, y_2, . . . , y_2^{(r_2 − 1)}, . . . , y_p, . . . , y_p^{(r_p − 1)}]^T    (C.6)

In the new coordinate system (C.1) takes the form

    ξ̇_i^j = ξ_i^{j+1},  j ≤ r_i − 1,  i ≤ p
    ξ̇_i^{r_i} = b_i(ξ, η) + a_i(ξ, η) u,  i ≤ p
    η̇ = q_1(ξ, η) + q_2(ξ, η) u
    y_i = ξ_i^1,  i ≤ p    (C.7)

with b_i(ξ, η) ∈ R, a_i(ξ, η) ∈ R^{1×p}, q_1(ξ, η) ∈ R^{n − n_r}, and q_2(ξ, η) ∈ R^{(n − n_r) × p}. Note that

    Γ(Φ^{-1}(ξ, η)) = [a_1(ξ, η)^T, a_2(ξ, η)^T, . . . , a_p(ξ, η)^T]^T =: A(ξ, η)    (C.8)

By the assumption of vector relative degree, A(ξ, η) is nonsingular. Let b(ξ, η) := [b_1(ξ, η), . . . , b_p(ξ, η)]^T. Then defining the control law

    u := A(ξ, η)^{-1} ( v − b(ξ, η) )    (C.9)

puts our system in the form

    ξ̇_i^j = ξ_i^{j+1},  j ≤ r_i − 1,  i ≤ p
    ξ̇_i^{r_i} = v_i,  i ≤ p
    η̇ = s_1(ξ, η) + s_2(ξ, η) v
    y_i = ξ_i^1,  i ≤ p    (C.10)

where

    s_1(ξ, η) := q_1(ξ, η) − q_2(ξ, η) A(ξ, η)^{-1} b(ξ, η)    (C.11)

and

    s_2(ξ, η) := q_2(ξ, η) A(ξ, η)^{-1}    (C.12)

The ξ part of the dynamics of system (C.10) is in the form of p integrator chains, each of length r_i, i ≤ p. The η dynamics of (C.10) are the unobservable, internal dynamics of (C.1).
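The recipe above can be walked through on a damped pendulum (our illustration, with hand-computed Lie derivatives; not an example from the dissertation). With ẋ_1 = x_2, ẋ_2 = −sin x_1 − (1/2)x_2 + u and y = x_1, we get L_f h = x_2, L_f^2 h = −sin x_1 − (1/2)x_2, and L_g L_f h = 1, so the vector relative degree is 2 and the decoupling "matrix" is the scalar 1:

```python
import math

# Damped pendulum: x1' = x2,  x2' = -sin(x1) - 0.5*x2 + u,  y = x1.
# Linearizing control (C.9): u = v - L_f^2 h, with v = -6*x1 - 5*x2
# placing the closed-loop poles of the resulting double integrator at -2, -3.
x1, x2 = 2.0, 0.0
dt = 1e-3
for _ in range(int(10.0 / dt)):
    v = -6.0 * x1 - 5.0 * x2
    u = v - (-math.sin(x1) - 0.5 * x2)        # cancel L_f^2 h exactly
    x1, x2 = x1 + dt * x2, x2 + dt * (-math.sin(x1) - 0.5 * x2 + u)

print(x1, x2)   # closed loop behaves as x1'' = -6*x1 - 5*x1': decays to 0
```

After the cancellation the loop is exactly the linear error dynamics of Theorem B.4.1, which is the point of the construction.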
