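Vectors
\[ \mathbf{a} \cdot \mathbf{b} = a_1 b_1 + a_2 b_2 + a_3 b_3 \]
\[ \mathbf{a} \times \mathbf{b} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{vmatrix} \]
\[ \nabla = \mathbf{i}\,\frac{\partial}{\partial x} + \mathbf{j}\,\frac{\partial}{\partial y} + \mathbf{k}\,\frac{\partial}{\partial z} \]
\[ \nabla f = \operatorname{grad} f = (f_x, f_y, f_z) \]
\[ \nabla \cdot \mathbf{v} = \operatorname{div} \mathbf{v} = u_x + v_y + w_z \]
\[ \nabla \times \mathbf{v} = \operatorname{curl} \mathbf{v} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \partial/\partial x & \partial/\partial y & \partial/\partial z \\ u & v & w \end{vmatrix} \]
\[ x = \rho \sin\phi \cos\theta, \qquad y = \rho \sin\phi \sin\theta, \qquad z = \rho \cos\phi \]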
A Second Course in
Calculus
Harley Flanders Tel Aviv University
Robert R. Korfhage Southern Methodist University
Justin J. Price Purdue University
ACADEMIC PRESS
New York San Francisco London
A Subsidiary of Harcourt Brace Jovanovich, Publishers
Contents
Preface
1. INFINITE SERIES AND INTEGRALS
1. Infinite Series
2. Convergence and Divergence
3. Tests for Convergence
4. Series with Positive and Negative Terms
5. Improper Integrals
6. Convergence and Divergence
7. Relation to Infinite Series
8. Other Improper Integrals
9. Some Definite Integrals [optional]
10. Stirling's Formula [optional]
2. TAYLOR APPROXIMATIONS
1. Introduction
2. Polynomials
3. Taylor Polynomials
4. Applications
5. Taylor Series
6. Derivation of Taylor's Formula
3. POWER SERIES
1. Introduction
2. Ratio Test
3. Expansions of Functions
4. Further Techniques
5. Binomial Series
6. Alternating Series
7. Applications to Definite Integrals [optional]
8. Uniform Convergence [optional]
4. SOLID ANALYTIC GEOMETRY
1. Coordinates and Vectors
2. Length and Dot Product
3. Lines and Planes
4. Linear Systems and Intersections
5. Cross Product
6. Applications
5. VECTOR CALCULUS
1. Vector Functions
2. Space Curves
3. Curvature
4. Velocity and Acceleration
5. Integrals
6. Polar Coordinates
7. Polar Velocity and Acceleration [optional]
6. FUNCTIONS OF SEVERAL VARIABLES
1. Introduction
2. Domains
3. Continuity
4. Graphs
5. Partial Derivatives
6. Maxima and Minima
7. LINEAR FUNCTIONS AND MATRICES
1. Introduction
2. Linear Transformations
3. Matrix Calculations
4. Applications
5. Quadratic Forms
6. Quadric Surfaces
7. Inverses
8. Characteristic Roots [optional]
8. SEVERAL VARIABLE DIFFERENTIAL CALCULUS
1. Differentiable Functions
2. Chain Rule
3. Tangent Plane
4. Gradient
5. Directional Derivative
6. Applications
7. Implicit Functions
8. Differentials
9. Proof of the Chain Rule [optional]
9. HIGHER PARTIAL DERIVATIVES
1. Mixed Partials
2. Higher Partials
3. Taylor Polynomials
4. Maxima and Minima
5. Applications
6. Three Variables
7. Maxima with Constraints [optional]
8. Further Constraint Problems [optional]
10. DOUBLE INTEGRALS
1. Introduction
2. Special Cases
3. Iterated Integrals
4. Applications
5. General Domains
6. Polar Coordinates
11. MULTIPLE INTEGRALS
1. Triple Integrals
2. Cylindrical Coordinates
3. Spherical Coordinates
4. Center of Gravity
5. Moments of Inertia
12. INTEGRATION THEORY
1. Introduction
2. Step Functions
3. The Riemann Integral
4. Iteration
5. Change of Variables
6. Applications of Integration
7. Improper Integrals [optional]
8. Numerical Integration [optional]
13. DIFFERENTIAL EQUATIONS
1. Introduction
2. Separation of Variables
3. Linear Differential Equations
4. Homogeneous Equations
5. Non-Homogeneous Equations
6. Applications
7. Approximate Solutions
14. SECOND ORDER EQUATIONS AND SYSTEMS
1. Linear Equations
2. Homogeneous Equations
3. Particular Solutions
4. Applications
5. Power Series Solutions
6. Matrix Power Series
7. Systems
8. Uniqueness of Solutions [optional]
15. COMPLEX ANALYSIS
1. Introduction
2. Complex Arithmetic
3. Polar Form
4. Complex Exponentials
5. Integration and Differentiation
6. Applications to Differential Equations
7. Applications to Power Series
Mathematical Tables
1. Trigonometric Functions
2. Trigonometric Functions for Angles in Radians
3. Four-Place Mantissas for Common Logarithms
4. Antilogarithms
5. Exponential Functions
Answers to Selected Exercises
Index
Preface
This text, designed for a second year calculus course, can follow any
standard first year course in one-variable calculus. Its purpose is to cover
the material most useful at this level, to maintain a balance between theory
and practice, and to develop techniques and problem solving skills.
The topics fall into several categories:
Infinite series and integrals
Chapter 1 covers convergence and divergence of series and integrals. It
contains proofs of basic convergence tests, relations between series and
integrals, and manipulation with geometric, exponential, and related series.
Chapter 2 covers approximation of functions by Taylor polynomials, with
emphasis on numerical approximations and estimates of remainders. Chapter 3
deals with power series, including intervals of convergence, expansions of
functions, and uniform convergence. It features calculations with series by
algebraic operations, substitution, and term-by-term differentiation and
integration.
Vector methods
Vector algebra is introduced in Chapter 4 and applied to solid analytic
geometry. The calculus of one-variable vector functions and its applications
to space curves and particle mechanics comprise Chapter 5.
Linear algebra
Chapter 7 contains a practical introduction to linear algebra in two and
three dimensions. We do not attempt a complete treatment of foundations,
but rather limit ourselves to those topics that have immediate application
to calculus. The main topics are linear transformations in $\mathbf{R}^2$ and
$\mathbf{R}^3$, their matrix representations, manipulation with matrices,
linear systems, quadratic forms, and quadric surfaces.
Differential calculus of several variables
Chapter 6 contains preliminary material on sets in the plane and space,
and the definition and basic properties of continuous functions. This is
followed by partial derivatives with applications to maxima and minima.
Chapter 8 continues with a careful treatment of differentiability and
applications to tangent planes, gradients, directional derivatives, and
differentials. Here ideas from linear algebra are used judiciously. Chapter 9
covers higher order partial derivatives, Taylor polynomials, and second
derivative tests for extrema.
Multiple integrals
In Chapters 10 and 11 we treat double and triple integrals intuitively,
with emphasis on iteration, geometric and physical applications, and
coordinate changes. In Chapter 12 we develop the theory of the Riemann
integral starting with step functions. We continue with Jacobians and the
change of variable formula, surface area, and Green's Theorem.
Differential equations
Chapter 13 contains an elementary treatment of first order equations,
with emphasis on linear equations, approximate solutions, and applications.
Chapter 14 covers second order linear equations and first order linear
systems, including matrix series solutions. These chapters can be taken up
any time after Chapter 7.
Complex analysis
The final chapter moves quickly through basic complex algebra to complex
power series, shortcuts using the complex exponential function, and
applications to integration and differential equations.
Features
The key points of one-variable calculus are reviewed briefly as needed.
Optional topics are scattered throughout, for example Stirling's Formula,
characteristic roots and vectors, Lagrange multipliers, and Simpson's Rule
for double integrals.
Numerous worked examples teach practical skills and demonstrate the
utility of the theory.
We emphasize simple line drawings that a student can learn to do himself.
Acknowledgments
We appreciate the invaluable assistance of our typists, Sara Marcus
and Elizabeth Young, the high quality graphics of Vantage Art Inc., and
the outstanding production job of the Academic Press staff.
1. Infinite Series and Integrals
1. INFINITE SERIES
One of the most important topics in mathematical analysis, both in theory
and applications, is infinite series. The basic problem is how to add up a sum
with infinitely many terms. At first that seems impossible; life is too short.
However, suppose we look at the sum
\[ 1 + \frac{1}{2} + \frac{1}{4} + \cdots + \frac{1}{2^n} + \cdots \]
and start adding up terms. We find $1, \frac{3}{2}, \frac{7}{4}, \frac{15}{8}, \ldots$,
numbers getting closer and closer to 2. The message is clear: in some limit
sense the total of all the terms is 2.
If we try to add up terms of the sum
\[ 1 + 1 + 1 + \cdots, \]
we find $1, 2, 3, 4, \ldots$, numbers becoming larger and larger. The situation is
hopeless; there is no reasonable total.
Let us now consider in some detail two important infinite sums.
Geometric Series
A geometric series is an infinite sum in which the ratio of any two
consecutive terms is always the same:
\[ a + ar + ar^2 + \cdots + ar^n + \cdots \qquad (a \neq 0,\ r \neq 0). \]
Let $s_n$ denote the sum of all terms up to $ar^n$,
\[ s_n = a + ar + ar^2 + \cdots + ar^n. \]
If $r = 1$, then $s_n = a + a + \cdots + a = (n + 1)a$, so $s_n \to \pm\infty$. If
$r \neq 1$, there is a simple formula for $s_n$:
\[ s_n = a(1 + r + r^2 + \cdots + r^n) = a\left(\frac{1 - r^{n+1}}{1 - r}\right). \]
(To check, multiply both sides by $1 - r$.) If $|r| < 1$, then $r^{n+1} \to 0$ as $n$
increases. Hence a logical choice for the sum of the geometric series is
$a/(1 - r)$. But if $|r| > 1$, then $r^{n+1}$ grows beyond all bound, and the situation
is hopeless. If $r = -1$, then $s_n$ is alternately $a$ and 0. There is no reasonable
sum in this case either.
An infinite geometric series
\[ a + ar + ar^2 + \cdots + ar^n + \cdots \]
has the sum $a/(1 - r)$ if $|r| < 1$, but no sum if $|r| \geq 1$.
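The formula for $s_n$ and the limiting value $a/(1 - r)$ can be checked numerically. The following is only an illustrative sketch; the function name `geometric_partial_sum` and the sample values of `a` and `r` are our own choices, not part of the text.

```python
def geometric_partial_sum(a, r, n):
    """Return s_n = a + a*r + ... + a*r**n by adding the terms one at a time."""
    total = 0.0
    term = a
    for _ in range(n + 1):
        total += term
        term *= r
    return total


# Sample values (assumed for illustration): a = 1, r = 1/2, so a/(1 - r) = 2.
a, r = 1.0, 0.5
for n in (1, 5, 10, 20):
    s_n = geometric_partial_sum(a, r, n)
    closed_form = a * (1 - r ** (n + 1)) / (1 - r)
    # Columns: n, direct sum, closed formula, proposed sum a/(1 - r)
    print(n, s_n, closed_form, a / (1 - r))
```

The direct sum and the closed formula agree, and both approach $a/(1 - r)$ as $n$ grows.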
Harmonic Series
The series
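\[ 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n} + \cdots \]
is known as the harmonic series. It is not at all obvious, but the sums
$s_n = 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n}$ increase beyond all bound, so the
series has no sum. To see why, we observe that
\[ s_1 = 1 > \tfrac{1}{2}, \]
\[ s_2 = s_1 + \tfrac{1}{2} > \tfrac{1}{2} + \tfrac{1}{2} = \tfrac{2}{2}, \]
\[ s_4 = s_2 + \left(\tfrac{1}{3} + \tfrac{1}{4}\right) > s_2 + \left(\tfrac{1}{4} + \tfrac{1}{4}\right) > \tfrac{2}{2} + \tfrac{1}{2} = \tfrac{3}{2}, \]
\[ s_8 = s_4 + \left(\tfrac{1}{5} + \tfrac{1}{6} + \tfrac{1}{7} + \tfrac{1}{8}\right) > s_4 + \left(\tfrac{1}{8} + \tfrac{1}{8} + \tfrac{1}{8} + \tfrac{1}{8}\right) > \tfrac{3}{2} + \tfrac{1}{2} = \tfrac{4}{2}. \]
Similarly, $s_{16} > 5/2$, $s_{32} > 6/2$, $\ldots$, $s_{2^n} > (n + 1)/2$. Now the sequence of
sums $s_n$ increases, and our estimates show $s_n$ eventually passes any given
positive number. (This happens very slowly it is true; around $2^{15}$ terms are
needed before $s_n$ exceeds 10 and around $2^{29}$ terms before it exceeds 20.)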
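To get a feeling for how slowly these sums grow, one can simply add terms on a computer. The sketch below checks the estimate $s_{2^k} > (k + 1)/2$ for small $k$ and counts how many terms are needed to pass a given value. The helper names `harmonic` and `terms_needed` are hypothetical, and the sums are ordinary floating-point additions, so the counts are numerical observations, not proofs.

```python
def harmonic(n):
    """Partial sum s_n = 1 + 1/2 + ... + 1/n."""
    return sum(1.0 / k for k in range(1, n + 1))


def terms_needed(target):
    """Smallest n for which s_n exceeds target, found by direct summation."""
    total = 0.0
    n = 0
    while total <= target:
        n += 1
        total += 1.0 / n
    return n


# The text's estimate: s_(2^k) > (k + 1)/2.
for k in range(0, 11):
    print(k, harmonic(2 ** k), (k + 1) / 2)

# How many terms before the partial sums pass 5 and 10?
print(terms_needed(5), terms_needed(10))
```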
Remark: Both the geometric series for 0 < r < 1 and the harmonic
series have positive terms that decrease toward zero, yet one series has a sum
and the other does not. This indicates the subtlety we must expect in our
further study of infinite series.
EXERCISES
Find the sum:
L1 + + p+ + ^ 2- 1 _ ^+l _ + h
32 33 Sn+1
5. 3+ - + ^ H- - - - - h 6. 1- y2+ y*~ -\- - - - - - + y*>
7. $r^{1/2} + r + r^{3/2} + \cdots + r^4$   8. $(x + 1) + (x + 1)^2 + \cdots + (x + 1)^5$.
Find the sum of the series:
9- 1- ? + (1 Y - (1 Y + io. J - 7 + S - A +
5 W W ' 2 4 1 8 16
1L 2*0+^1 + ^2+ 12- + ^ + i + '
13. _- - - 1- - - - \- - - - 1- - - - 1- - - - 1- ...
2 + x2^ (2 + x2)2^ (2 + x2)3^
. . cos 6 , cos28 . cos36 .
14 . - - - - - - - - - - - - h .
2 4 8
15. A certain rubber ball when dropped will bounce back to half the height from
which it is released. If the ball is dropped from 3 ft and continues to bounce
indefinitely, find the total distance through which it moves.
16. Trains A and B are 60 miles apart on the same track and start moving toward
each other at the rate of 30 mph. At the same time, a fly starts at train A and
flies to train B at 60 mph. Then it returns to train A, then to B, etc. Use a
geometric series to compute the total distance it flies until the trains meet.
17. (cont.) Do Ex. 16 without geometric series.
18. A line segment of length L is drawn and its middle third is erased. Then (step 2)
the middle third of each of the two remaining segments is erased. Then (step 3)
the middle third of each of the four remaining segments is erased, etc. After
step n, what is the total length of all the segments deleted?
Interpret the repeating decimals as geometric series and find their sums:
19. $0.11111\cdots$   20. $0.101010\cdots$
21. $0.434343\cdots$   22. $0.185185185\cdots$.
Show that the series have no sums:
23. $\frac{1}{2} + \frac{1}{4} + \frac{1}{6} + \frac{1}{8} + \cdots$   24. $1 + \frac{1}{3} + \frac{1}{5} + \frac{1}{7} + \cdots$.
25. Find n so large that
26. Aristotle summarized Zeno's paradoxes as follows:
I can't go from here to the wall. For to do so, I must first cover half the
distance, then half the remaining distance, then again half of what still
remains. This process can always be continued and can never be completed.
Explain what is going on here.
2. CONVERGENCE AND DIVERGENCE
It is time to formulate the ideas of Section 1 more precisely.
An infinite series is a formal sum
\[ a_1 + a_2 + a_3 + \cdots. \]
Associated with each infinite series is its sequence $\{s_n\}$ of partial sums
defined by
\[ s_1 = a_1, \quad s_2 = a_1 + a_2, \quad \ldots, \quad s_n = a_1 + a_2 + \cdots + a_n. \]
A series converges to the number $S$, or has sum $S$, if $\lim_{n\to\infty} s_n = S$. A
series diverges, or has no sum, if $\lim_{n\to\infty} s_n$ does not exist. A series that
converges is called convergent; a series that diverges is called divergent.
Let us recall the meaning of the statement $\lim_{n\to\infty} s_n = S$. Intuitively, it
means that as $N$ grows larger and larger, the greatest distance $|s_n - S|$, for
all $n \geq N$, becomes smaller and smaller. Precisely, for each $\varepsilon > 0$, there is
a positive integer $N$ such that $|s_n - S| < \varepsilon$ for all $n \geq N$. Let us rephrase
the definition of convergence accordingly.
The infinite series $a_1 + a_2 + a_3 + \cdots$ converges to $S$ if for each $\varepsilon > 0$,
there is a positive integer $N$ such that
\[ |(a_1 + a_2 + \cdots + a_n) - S| < \varepsilon \]
whenever $n \geq N$.
Thus, no matter how small $\varepsilon$, you will get within $\varepsilon$ of $S$ by adding up enough
terms. For each $\varepsilon$, the $N$ tells how many terms are enough. Naturally the
smaller $\varepsilon$ is, the larger $N$ will be. From the way convergence is defined, the
study of infinite series is really the study of sequences of partial sums. Hence we
may apply everything we know about sequences.
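To see how $N$ depends on $\varepsilon$ in a concrete case, take the geometric series of Section 1 with $|r| < 1$, whose sum is $S = a/(1 - r)$. The sketch below finds the first index at which the partial sum is within $\varepsilon$ of $S$. Because the distance $|S - s_n|$ shrinks steadily for a geometric series, that index works for all later $n$ too. The function name `first_index_within` and the default values are assumptions made for illustration, and the indexing follows Section 1 ($s_0 = a$).

```python
def first_index_within(eps, a=1.0, r=0.5):
    """Smallest n with |s_n - S| < eps, where s_n = a + a*r + ... + a*r**n
    and S = a/(1 - r). Assumes |r| < 1, so the distance to S keeps shrinking."""
    S = a / (1 - r)
    s = a        # s_0
    term = a
    n = 0
    while abs(S - s) >= eps:
        term *= r
        s += term
        n += 1
    return n


# Smaller eps forces a larger N.
for eps in (0.1, 0.001, 1e-6):
    print(eps, first_index_within(eps))
```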
We know that inserting, deleting, or altering any finite number of elements
of a sequence does not affect its convergence or divergence. The same holds for
series. For instance, if we delete the first 10 terms of the series
$a_1 + a_2 + a_3 + \cdots$, then we decrease each partial sum $s_n$ (for $n > 10$) by the
amount $a_1 + a_2 + \cdots + a_{10}$. If the original series diverges, then so does the
modified series. If it converges to $S$, then the modified series converges to
$S - (a_1 + a_2 + \cdots + a_{10})$.
Warning: In problems where we must decide whether a given infinite
series converges or diverges, we shall often, without prior notice, ignore or
change a (finite) batch of terms at the beginning. This, we now know, does not
affect convergence.
Notation
The first term of a series need not be $a_1$. Often it is convenient to start with
$a_0$ or with some other $a_k$.
It is also convenient to use summation notation and abbreviate
$a_1 + a_2 + a_3 + \cdots$ by $\sum_{n=1}^{\infty} a_n$, and even simply $\sum a_n$. In summation
notation, the partial sums $s_n$ of an infinite series $\sum_{n=1}^{\infty} a_n$ are given by
\[ s_n = \sum_{k=1}^{n} a_k. \]
Cauchy Criterion
Recall the Cauchy criterion for convergence of sequences:
A sequence $\{s_n\}$ converges if and only if for each $\varepsilon > 0$, there is a positive
integer $N$ such that
\[ |s_m - s_n| < \varepsilon \]
whenever $m, n \geq N$.
Thus all elements of the sequence beyond a certain point must be within e of
each other. The advantage of the Cauchy criterion is that it depends only on
the elements of the sequence itself; you don't have to know the limit of a
sequence in order to show convergence. That's a great help; sometimes it is
very hard to find the exact limit of a sequence, whereas you may only need to
know that the sequence does indeed converge to some limit.
Let us apply the Cauchy criterion to the partial sums of a series. We simply
observe (for m > n) that
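\[ s_m - s_n = (a_1 + a_2 + \cdots + a_n + a_{n+1} + \cdots + a_m) - (a_1 + a_2 + \cdots + a_n) \]
\[ \phantom{s_m - s_n} = a_{n+1} + a_{n+2} + \cdots + a_m. \]
Cauchy Test An infinite series $\sum a_n$ converges if and only if for each
$\varepsilon > 0$, there is a positive integer $N$ such that
\[ |a_{n+1} + a_{n+2} + \cdots + a_m| < \varepsilon \]
whenever $m > n \geq N$.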
Thus beyond a certain point in the series, any block of consecutive terms, no
matter how long, must have a very small sum.
In the last section we proved the harmonic series diverges by producing
blocks of terms arbitrarily far out in the series whose sum exceeds $\frac{1}{2}$. In other
words, we showed that the Cauchy test fails for $\varepsilon = \frac{1}{2}$.
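The failure of the Cauchy test for the harmonic series can be observed directly. The block $1/(n+1) + \cdots + 1/(2n)$ is exactly $s_{2n} - s_n$, and it contains $n$ terms each at least $1/(2n)$. The sketch below (the name `block_sum` is our own) evaluates such blocks far out in the series.

```python
def block_sum(n):
    """Sum of the block 1/(n+1) + ... + 1/(2n), which equals s_(2n) - s_n."""
    return sum(1.0 / k for k in range(n + 1, 2 * n + 1))


# No matter how far out the block starts, its sum stays above 1/2.
for n in (2, 10, 100, 10000):
    print(n, block_sum(n))
```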
Suppose the Cauchy test is satisfied, and take m = n + 1. Then the block
consists of just one term $a_m$, so $|a_m| < \varepsilon$ when $m > N$. In other words,
$a_m \to 0$.
Necessary Condition for Convergence If the series $\sum a_n$ converges, then
$\lim_{n\to\infty} a_n = 0$.
Warning: This condition is not sufficient for convergence. The harmonic
series $1 + \frac{1}{2} + \frac{1}{3} + \cdots$ diverges even though $1/n \to 0$.
Positive Terms
Suppose an infinite series has only non-negative terms. Then its partial
sums form an increasing sequence, $s_1 \leq s_2 \leq s_3 \leq s_4 \leq \cdots$. Recall that an
increasing sequence must be one of two types: Either (a) the sequence is
bounded above, in which case it converges; or (b) it is not bounded above, and
it marches off the map to $+\infty$.
We deduce corresponding statements about series:
A series $a_1 + a_2 + a_3 + \cdots$ with $a_n \geq 0$ converges if and only if there
exists a positive number $M$ such that
\[ a_1 + a_2 + \cdots + a_n \leq M \quad \text{for all } n \geq 1. \]
Using this fact, we can often establish the convergence or divergence of a
given series by comparing it with a familiar series.
Comparison Test Suppose $\sum a_n$ and $\sum b_n$ are series with non-negative
terms.
(1) If $\sum a_n$ converges and if $b_n \leq a_n$ for all $n \geq 1$, then $\sum b_n$ also
converges.
(2) If $\sum a_n$ diverges and if $b_n \geq a_n$ for all $n \geq 1$, then $\sum b_n$ also diverges.
Proof: Let $s_n$ and $t_n$ denote the partial sums of $\sum a_n$ and $\sum b_n$
respectively. Then $\{s_n\}$ and $\{t_n\}$ are increasing sequences.
(1) Since $\sum a_n$ converges, $s_n \leq \sum_{1}^{\infty} a_n = M$ for all $n \geq 1$. Since $b_k \leq a_k$
for all $k$, we have $t_n \leq s_n$ for all $n$. Hence $t_n \leq s_n \leq M$ for all $n \geq 1$, so
$\sum b_n$ converges.
(2) Since $\sum a_n$ diverges, the sequence $\{s_n\}$ is unbounded. Since $b_k \geq a_k$, we
have $t_n \geq s_n$. Hence $\{t_n\}$ is also unbounded, so $\sum b_n$ diverges.
Note: It is important to apply the Comparison Test correctly. Roughly
speaking, (1) says that "smaller than small is small" and (2) says that
"bigger than big is big." However the phrases "smaller than big" and
"bigger than small" contain little useful information.
EXAMPLE 2.1
Test for convergence or divergence:
(a) $\sum \frac{\sin^2 n}{3^n}$   (b) $\sum \frac{1}{\sqrt{n}}$   (c) $\sum \frac{n}{2n + 1}$.
Solution: (a) $(\sin^2 n)/3^n \leq 1/3^n$. But $\sum 1/3^n$ converges, so the given
series converges.
(b) $1/\sqrt{n} \geq 1/n$. But $\sum 1/n$ diverges, so the given series diverges.
(c) Diverges because $a_n = n/(2n + 1) \to \frac{1}{2} \neq 0$.
Answer: (a) converges (b) diverges (c) diverges.
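Part (a) can be illustrated numerically: the partial sums of the smaller series never exceed those of the geometric series $\sum 1/3^n$, whose sum (starting at $n = 1$) is $\frac{1}{2}$. This is only a sketch for illustration; the helper names are our own, and a finite computation of course proves nothing about convergence.

```python
import math


def partial_sums(term, n):
    """Return the list [s_1, ..., s_n] for the series with the given term function."""
    sums, total = [], 0.0
    for k in range(1, n + 1):
        total += term(k)
        sums.append(total)
    return sums


def small_term(k):
    # Terms of series (a): sin^2(k) / 3^k
    return math.sin(k) ** 2 / 3 ** k


def big_term(k):
    # Terms of the comparison series: 1 / 3^k
    return 1 / 3 ** k


s = partial_sums(small_term, 40)
t = partial_sums(big_term, 40)
# Both sequences increase; the smaller one is trapped below the larger one.
print(s[-1], t[-1])
```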
p-Series
The comparison test is useful provided you have a good supply of known
series. An excellent class of series for comparisons are those of the form
$\sum 1/n^p$.
The series $\sum 1/n^p$ diverges if $p \leq 1$ and converges if $p > 1$.
Proof: If $0 < p \leq 1$, then $1/n^p \geq 1/n$ and the series diverges by
comparison with the divergent series $\sum 1/n$.
If p > 1, we shall show that the partial sums of the series are bounded. We
use an important trick: we interpret snas an area and compare it with a region
below the curve y = l/ xp. See Fig. 2.1.
Fig. 2.1 Area under the curve exceeds the rectangular sum.
The combined areas of the rectangles shown is less than the area under the
decreasing curve between x = 1 and x = n. Therefore
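\[ \frac{1}{2^p} + \frac{1}{3^p} + \cdots + \frac{1}{n^p} < \int_1^n \frac{dx}{x^p} = \left. \frac{-1}{(p - 1)\,x^{p-1}} \right|_1^n = \frac{1}{p - 1}\left(1 - \frac{1}{n^{p-1}}\right). \]
Since $p - 1 > 0$, the right side is a positive number, a little less than $1/(p - 1)$
for all values of $n$. Hence
\[ s_n = 1 + \left(\frac{1}{2^p} + \frac{1}{3^p} + \cdots + \frac{1}{n^p}\right) < 1 + \frac{1}{p - 1} \]
for $n \geq 1$. Thus the partial sums are bounded if $p > 1$, so the series converges.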
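The bound $s_n < 1 + 1/(p - 1)$ is easy to test for a few values of $p > 1$. The sketch below uses a hypothetical helper `p_series_partial_sum`; the particular values of $p$ and $n$ are arbitrary choices for illustration.

```python
def p_series_partial_sum(p, n):
    """Partial sum s_n = 1 + 1/2**p + ... + 1/n**p."""
    return sum(1.0 / k ** p for k in range(1, n + 1))


for p in (1.5, 2.0, 3.0):
    bound = 1 + 1 / (p - 1)
    for n in (1, 10, 1000, 100000):
        # Columns: p, n, partial sum, bound from the area comparison
        print(p, n, p_series_partial_sum(p, n), bound)
```

For every $p$ and $n$ tried, the partial sum stays below the bound, as the area comparison predicts.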
EXERCISES
Determine whether the series converges or diverges:
n 2 + 1 2n\ / n
3 Y - 5 4 V ____l
L j 4n + 3 # (2n - 1
^ ^ ny/ n +3 ^
r. V n2 s V
/ j/ 2n4 7 / j/ In n
*2? X
)2
+
(n+ 1) (n + 3) (n + 5)
11. Show that $\sum_{n=1}^{\infty} 1/n^2 < 2$. [Hint: See text.]
12. Show that $\sum 1/n! < 3$. [Hint: Compare $n!$ with $2^n$.]
13. Prove that if $\sum a_n$ and $\sum b_n$ converge, then so does $\sum (a_n + b_n)$, and find the
sum.
14. If $\sum a_n$ and $\sum b_n$ diverge, show by examples that $\sum (a_n + b_n)$ may either
converge or diverge.
Let $\sum a_n$ be a convergent series of positive terms:
15. Prove that $\sum a_n^2$ converges.
16. Show by examples that $\sum \sqrt{a_n}$ may either converge or diverge.
3. TESTS FOR CONVERGENCE
Suppose $\sum a_n$ is a given series, and $c \neq 0$. Then the two series $\sum a_n$ and
$\sum ca_n$ either both converge or both diverge. For the partial sums of the series
are $\{s_n\}$ and $\{cs_n\}$. Clearly these sequences converge or diverge together.
We can extend these remarks to a pair of series $\sum a_n$ and $\sum b_n$ where the
ratios $b_n/a_n$ are not constant, but restricted to a suitable range. Throughout the
rest of this section all series will have only positive terms.
Let $\sum a_n$ and $\sum b_n$ be given series with positive terms. Suppose there exist
positive numbers $c$ and $d$ such that
\[ c \leq \frac{b_n}{a_n} \leq d \]
for all sufficiently large $n$. Then the series both converge or both diverge.
Proof: If $\sum a_n$ converges, then so does $\sum d a_n$. But $b_n \leq d a_n$, so $\sum b_n$
converges.
If $\sum a_n$ diverges, then so does $\sum c a_n$. But $b_n \geq c a_n$, so $\sum b_n$ diverges.
Done.
The conditions of the preceding test are automatically satisfied if the ratios
$b_n/a_n$ actually approach a positive limit $L$. Then, by the definition of limit with
$\varepsilon = \frac{1}{2}L$, all ratios satisfy $\frac{1}{2}L < b_n/a_n < \frac{3}{2}L$, except perhaps for a finite
number of them.
Let $\sum a_n$ and $\sum b_n$ have positive terms. If $\lim b_n/a_n = L$ exists and if
$L > 0$, then either both series converge or both series diverge.
EXAMPLE 3.1
Test for convergence or divergence:
(a) $\sum \frac{1}{n + \sqrt{n}}$   (b) $\sum \frac{4n + 1}{3n^3 - n^2 - 1}$.
Solution: (a) When $n$ is very large, $n$ is much larger than $\sqrt{n}$. This
suggests that the terms behave roughly like $1/n$, so we apply the test with
$a_n = 1/n$ and $b_n = 1/(n + \sqrt{n})$:
\[ \frac{b_n}{a_n} = \frac{n}{n + \sqrt{n}} = \frac{1}{1 + 1/\sqrt{n}} \to \frac{1}{1 + 0} = 1, \quad \text{as } n \to \infty. \]
The ratios have a positive limit. Therefore $\sum b_n$ diverges since $\sum 1/n$ diverges.
(b) When $n$ is very large, the terms appear to behave like $4n/3n^3 = 4/3n^2$.
This suggests comparison with the convergent series $\sum 1/n^2$. Let $a_n = 1/n^2$
and $b_n = (4n + 1)/(3n^3 - n^2 - 1)$. Then
\[ \frac{b_n}{a_n} = \frac{(4n + 1)n^2}{3n^3 - n^2 - 1} = \frac{4 + 1/n}{3 - 1/n - 1/n^3} \to \frac{4}{3}. \]
The ratios have a positive limit. Therefore $\sum b_n$ converges because $\sum 1/n^2$
converges.
Answer: (a) diverges (b) converges.
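The two limits in this example can be watched numerically by evaluating the ratios $b_n/a_n$ for increasing $n$. The function names `ratio_a` and `ratio_b` below are our own labels for the two parts of the example.

```python
import math


def ratio_a(n):
    """b_n / a_n for part (a): a_n = 1/n, b_n = 1/(n + sqrt(n))."""
    return n / (n + math.sqrt(n))


def ratio_b(n):
    """b_n / a_n for part (b): a_n = 1/n**2, b_n = (4n + 1)/(3n**3 - n**2 - 1)."""
    return (4 * n + 1) * n ** 2 / (3 * n ** 3 - n ** 2 - 1)


for n in (10, 1000, 10 ** 6):
    # Part (a) drifts toward 1, part (b) toward 4/3.
    print(n, ratio_a(n), ratio_b(n))
```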
The Ratio Test
In a geometric series, the ratio $a_{n+1}/a_n$ is a constant, $r$. If $|r| < 1$, the
series converges, basically because its terms decrease rapidly. By analogy, we
should expect convergence in general if the ratios are small, not necessarily
constant.
Let $\sum a_n$ be a series of positive terms.
(1) The series converges if
\[ \frac{a_{n+1}}{a_n} \leq r < 1 \]
from some point on, that is for $n \geq N$.
(2) The series diverges if
\[ \frac{a_{n+1}}{a_n} \geq 1 \]
from some point on.
Proof: (1) Suppose $a_{n+1}/a_n \leq r < 1$ starting with $n = N$. Then
\[ a_{N+1} \leq a_N r, \qquad a_{N+2} \leq a_{N+1} r \leq a_N r^2, \]
and by induction, $a_{N+k} \leq a_N r^k$, that is, $a_n \leq a_N r^{n-N} = (a_N r^{-N}) r^n$ for all
$n \geq N$. It follows that the series $\sum a_n$ converges by comparison with the
convergent geometric series $\sum r^n$.
(2) From some point on, $a_{n+1} \geq a_n$. The terms increase, so they cannot
approach zero; hence the series diverges.
Warning: Note that the test for convergence requires $a_{n+1}/a_n \leq r < 1$,
not just $a_{n+1}/a_n < 1$. The ratios must stay away from 1. If $a_{n+1}/a_n < 1$ but
$a_{n+1}/a_n \to 1$, we may have divergence. For example, take $a_n = 1/n$.
Then $a_{n+1}/a_n = n/(n + 1) = 1 - 1/(n + 1) < 1$, but $\sum 1/n$ diverges.
It often happens that the ratios $a_{n+1}/a_n$ approach a limit.
Ratio Test Let $\sum a_n$ be a series of positive terms. Suppose $a_{n+1}/a_n \to r$.
(1) The series converges if $r < 1$.
(2) The series diverges if $r > 1$.
(3) If $r = 1$, the test is inconclusive; the series may either converge or
diverge.
Proof: (1) If $r < 1$, choose $\varepsilon$ so small that $r + \varepsilon < 1$. By definition of
the statement $a_{n+1}/a_n \to r$, there is a positive integer $N$ such that
$a_{n+1}/a_n < r + \varepsilon < 1$ for all $n \geq N$. Therefore the series converges by the
preceding test.
(2) Similarly, if $r > 1$, then $a_{n+1}/a_n > 1$ from some point on. The series
diverges.
(3) If $r = 1$, this test cannot distinguish between convergent and
divergent series. For example, take $a_n = 1/n^p$. The series converges for $p > 1$,
diverges for $p \leq 1$. But for all values of $p$,
\[ \frac{a_{n+1}}{a_n} = \frac{n^p}{(n + 1)^p} = \left(\frac{n}{n + 1}\right)^p = \left(1 - \frac{1}{n + 1}\right)^p \to (1 - 0)^p = 1. \]
EXAMPLE 3.2
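Test for convergence or divergence:
(a) $\sum \frac{n}{2^n}$   (b) $\sum \frac{10^n}{n!}$.
Solution: (a) Set $a_n = n/2^n$. Then
\[ \frac{a_{n+1}}{a_n} = \frac{n + 1}{2^{n+1}} \Big/ \frac{n}{2^n} = \frac{n + 1}{2n} = \frac{1}{2}\left(1 + \frac{1}{n}\right) \to \frac{1}{2}. \]
Since $\frac{1}{2} < 1$, the series converges by the ratio test.
(b) Set $a_n = 10^n/n!$. Then
\[ \frac{a_{n+1}}{a_n} = \frac{10^{n+1}}{1 \cdot 2 \cdots n(n + 1)} \Big/ \frac{10^n}{1 \cdot 2 \cdots n} = \frac{10}{n + 1} \to 0. \]
Since $0 < 1$, the series converges by the ratio test.
Answer: (a) converges (b) converges.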
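The ratios in both parts can be computed exactly with rational arithmetic, which avoids any rounding questions. The sketch below uses Python's `fractions.Fraction`; the names `a`, `b`, and `ratio` are our own.

```python
from fractions import Fraction
from math import factorial


def a(n):
    """Term of series (a): n / 2**n, as an exact fraction."""
    return Fraction(n, 2 ** n)


def b(n):
    """Term of series (b): 10**n / n!, as an exact fraction."""
    return Fraction(10 ** n, factorial(n))


def ratio(term, n):
    """The ratio of consecutive terms, term(n + 1) / term(n)."""
    return term(n + 1) / term(n)


for n in (1, 10, 100):
    # Part (a) should give (n + 1)/(2n); part (b) should give 10/(n + 1).
    print(n, ratio(a, n), ratio(b, n))
```

Note that in part (b) the early ratios are larger than 1 (for instance $10/2 = 5$ at $n = 1$); only the behavior from some point on matters.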
EXERCISES
Test for convergence or divergence:
1
' I
' I
1
2.
V 1
n2- 3 / ^\ / 2n3 n
1
4.
V* 5 + \ / n
4n 1
A/ 1+ n
n3
n\
6.
HO"
ne~n 8.
Y 3- +1
/ j 5e + n
9 V 10 V
Z/ 3- n Z/ On n)"
i i V w! 12 V i*!Z.
Z/ 1-3-5 (2n 1) Z/ (2w)!'
Find all real numbers $x$ for which the series converges:
X
x2n 1 sin2nx
s' l i L,
15. ^ (3x)2n 16. ^ m:2".
17. (Root Test) If $a_n \geq 0$ and if $\sqrt[n]{a_n} \leq r < 1$ for $n \geq 1$, show that $\sum a_n$
converges.
18*. Let $\sum a_n$ and $\sum b_n$ be series with positive terms. If $\sum a_n$ converges, and if
$b_{n+1}/b_n \leq a_{n+1}/a_n$, show that $\sum b_n$ also converges.
19. Let $\sum a_n$ and $\sum b_n$ be series with positive terms. Suppose $b_n/a_n \to 0$. Find
an example where $\sum b_n$ converges while $\sum a_n$ diverges. Does this contradict
the text?
4. SERIES WITH POSITIVE AND NEGATIVE TERMS
Infinite series with both positive and negative terms are generally more
complicated than series with terms all of the same sign. In this section, we
discuss two common types of mixed series that are manageable.
Alternating Series
An alternating series is one whose terms are alternately positive and
negative. Examples:
\[ 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots, \]
\[ 1 - x^2 + x^4 - x^6 + \cdots \quad \text{(alternating for all } x \neq 0\text{)}, \]
rp2 ^.3 4
x + H- - - - 1- - - - (alternating only for x > 0).
Such series have some extremely useful properties, two of which we now
state.
(1) If the terms of an alternating series decrease in absolute value to
zero, then the series converges.
(2) If such a series is broken off at the $n$-th term, then the remainder
(in absolute value) is less than the absolute value of the $(n + 1)$-th
term.
These assertions provide a very simple convergence criterion and an
immediate remainder estimate for alternating series. We shall not give a formal
proof; rather we shall show geometrically that they make good sense. However,
a proof is outlined in Exs. 15 and 16.
Suppose $\sum a_n$ is an alternating series whose terms decrease in absolute
value to zero. (To be definite, assume $a_1 > 0$.) Let $s_n = a_1 + a_2 + \cdots + a_n$.
Plot these partial sums (Fig. 4.1). The partial sums oscillate back and forth
as shown. But since the terms decrease to zero, the oscillations become shorter
and shorter. The odd partial sums decrease and the even ones increase,
squeezing down on some number $S$. Thus, the series converges to $S$.
Fig. 4.1 Partial sums of an alternating series. (On a number line: $s_2 < s_4 < s_6 < \cdots < S < \cdots < s_5 < s_3 < s_1 = a_1$, with successive jumps of lengths $|a_2|, |a_3|, |a_4|, |a_5|, \ldots$)
If the series is broken off after $n$ terms, the remainder is $|S - s_n|$. But
from Fig. 4.1,
\[ |S - s_n| < |s_{n+1} - s_n| = |a_{n+1}|. \]
Thus, the remainder is less than the absolute value of the $(n + 1)$-th term.
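The remainder estimate can be tried on the alternating harmonic series $1 - \frac{1}{2} + \frac{1}{3} - \cdots$. The sketch below relies on a fact not established in this section, namely that this series has sum $\ln 2$; we use it only as a reference value for the comparison. The helper name is hypothetical.

```python
import math


def alt_harmonic_partial_sum(n):
    """Partial sum s_n = 1 - 1/2 + 1/3 - ... + (-1)**(n-1)/n."""
    return sum((-1) ** (k - 1) / k for k in range(1, n + 1))


# Assumption (standard fact, not proved in this section): the sum is ln 2.
S = math.log(2)

for n in (1, 2, 5, 10, 100, 1000):
    remainder = abs(S - alt_harmonic_partial_sum(n))
    next_term = 1 / (n + 1)          # |a_(n+1)|
    # Columns: n, actual remainder, bound from the (n+1)-th term
    print(n, remainder, next_term)
```

In each row the actual remainder is smaller than the bound, roughly half of it.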
EXAMPLE 4.1
Find all values of x for which the series
converges.
Solution: From the ratio test, it is easily seen that the series
converges for $|x| < 1$ and diverges for $|x| > 1$. But what happens if $|x| = 1$?
At x = 1, the series is
\[ 1 - 1 + \frac{1}{2} - \frac{1}{3} + \frac{1}{4} - \cdots, \]
an alternating series whose terms decrease in absolute value to zero. Such a
series is guaranteed to converge by the statement above.
Answer: Converges for $|x| \leq 1$.
Notation: A useful device for abbreviating alternating series is use of
the factor $(-1)^n$, an automatic sign reverser. For example, the alternating
harmonic series can be written $\sum (-1)^{n-1}/n$.
EXAMPLE 4.2
It is known that the series $\sum_{n=0}^{\infty} x^n/n!$ converges to $e^x$ for all real $x$.
Use this series to estimate $1/e$ to 3-place accuracy.
Solution: Set $x = -1$. Then $e^{-1} = \sum_{n=0}^{\infty} (-1)^n/n!$. The signs alternate
and the terms decrease in absolute value to 0. Therefore,
\[ \frac{1}{e} = 1 - \frac{1}{1!} + \frac{1}{2!} - \cdots + \frac{(-1)^n}{n!} + \text{remainder}, \]
where
\[ |\text{remainder}| < \frac{1}{(n + 1)!}. \]
For 3-place accuracy, we need $|\text{remainder}| < 5 \times 10^{-4}$, so we want an $n$
for which
\[ \frac{1}{(n + 1)!} < 5 \times 10^{-4}, \qquad (n + 1)! > \tfrac{1}{5} \times 10^4 = 2000. \]
Now $6! = 720$ and $7! = 5040$. So we choose $n + 1 = 7$, that is, $n = 6$.
Answer: $\frac{1}{e} \approx 1 - \frac{1}{1!} + \frac{1}{2!} - \frac{1}{3!} + \frac{1}{4!} - \frac{1}{5!} + \frac{1}{6!} \approx 0.368$.
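The arithmetic of this example is easy to reproduce. The sketch below sums the series through $n = 6$ and compares the result with the library value of $e^{-1}$; the function name `inv_e_estimate` is our own.

```python
import math


def inv_e_estimate(n):
    """Partial sum of the series for 1/e: sum of (-1)**k / k! for k = 0..n."""
    return sum((-1) ** k / math.factorial(k) for k in range(n + 1))


n = 6
estimate = inv_e_estimate(n)
bound = 1 / math.factorial(n + 1)           # remainder bound 1/(n+1)!
actual_error = abs(estimate - math.exp(-1))

# The actual error should be below the bound, which is below 5e-4.
print(estimate, bound, actual_error)
```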
Absolute Convergence
How is it that the harmonic series $\sum 1/n$ diverges but the alternating
harmonic series $\sum (-1)^{n-1}/n$ converges? Essentially the harmonic series
diverges because its terms don't decrease quite fast enough, like $1/n^2$ or $1/2^n$
for example. Its partial sums consist of a lot of small terms which have a large
total. The terms of $\sum 1/2^n$, however, decrease so fast that the total of any
large number of them is bounded.
The alternating harmonic series converges, not by smallness of its terms
alone, but also because strategically placed minus signs cause lots of
cancellation. Just look at two consecutive terms:
\[ \frac{1}{n} - \frac{1}{n + 1} = \frac{1}{n(n + 1)}. \]
Cancellation produces a term of a convergent series! Thus $\sum (-1)^{n-1}/n$
converges because its terms get small and because a delicate balance of positive
and negative terms produces important cancellations.
Some series with mixed terms converge by the smallness of their terms
alone; they would converge even if all the signs were +. We say that a series
$\sum a_n$ converges absolutely if $\sum |a_n|$ converges. As we might expect, absolute
convergence implies (is even stronger than) convergence.
If a series $\sum a_n$ converges absolutely, then it converges.
Proof: Suppose $\sum |a_n|$ converges. By the Cauchy test, for each $\varepsilon > 0$
there is an $N$ such that
\[ |a_{n+1}| + |a_{n+2}| + \cdots + |a_m| < \varepsilon, \qquad m > n \geq N. \]
But
\[ |a_{n+1} + a_{n+2} + \cdots + a_m| \leq |a_{n+1}| + \cdots + |a_m| < \varepsilon \]
by the triangle inequality. Therefore $\sum a_n$ converges by the Cauchy test.
Remark: In studying series with mixed terms, it is a good idea to
check first for absolute convergence. Just change all signs to +, then use any
test for convergence of positive series.
EXAMPLE 4.3
Test for convergence and absolute convergence:
(a) 1 -j- -f- -f~ -)- -f~ *
w 22 32 42 52 62
Solution: (a) The series of absolute values is $\sum 1/n^2$, which converges.
The series is absolutely convergent, hence convergent.
(b) The series of absolute values is $\sum 1/\sqrt{n}$, which diverges. Hence the
given series does not converge absolutely. It does converge, nevertheless,
because it satisfies the test for alternating series: the terms decrease in absolute
value to zero.
Answer: (a) converges and converges absolutely,
(b) converges but not absolutely.
EXERCISES
Test the series for convergence and for absolute convergence:
1.
/ { in n / j
In n
LX ' - " c r . 2 - X
3X <_1)" 5^ 4' 2
<- i ) " ,
7. V (- 1) * 8. V ( 1)Bsin
/ ( n + inn n
9. y ^ io. y ( - 1)<+/ *, ' . .
/_! n2 Z/ Ve~ +
Test for convergence or divergence:
iLi+H +i + H ++- - -
12 1+ l " ~ 5 + + - - - - - -
13. Estimate $1/\sqrt{e}$ to 4-place accuracy by using the technique of Example 4.2.
14*. Suppose the series $\sum a_n x^n$ converges for the positive value $x_0$. Show that the
series converges absolutely for $|x| < x_0$.
15*. Suppose $\sum a_n$ is an alternating series whose terms decrease in absolute value
toward 0. Suppose the first term is $a_1 > 0$. If $\{s_n\}$ denotes the sequence of partial
sums, show that the subsequence $\{s_{2n}\}$ is increasing and bounded above and the
subsequence $\{s_{2n-1}\}$ is decreasing and bounded below.
16*. (cont.) Conclude that the two sequences converge and have the same limit $S$.
Show that $\sum a_n = S$.
17*. Suppose $\sum a_n^2$ and $\sum b_n^2$ both converge. Prove that $\sum a_n b_n$ converges absolutely.
18*. Suppose $\sum a_n$ converges, but not absolutely. Let $\sum b_j$ and $\sum c_k$ be the series
made of the positive $a_n$'s and negative $a_n$'s respectively. Prove that both $\sum b_j$
and $\sum c_k$ diverge.
5. IMPROPER INTEGRALS
In scientific problems, one frequently meets definite integrals in which
one (or both) of the limits is infinite. Here is an example.
Imagine a particle P of mass m at the origin. Consider the gravitational
potential at a point x = a due to P. This potential is the work required to
move a unit mass from the point x = a to infinity, against the force exerted
by P. According to Newton's Law of Gravitation, the force is $km/d^2$, where
$d$ is the distance between the two masses and $k$ is a proportionality constant.
The work done in moving the unit mass from $x = a$ to $x = b$ is
\[ \int_a^b (\text{force})\,dx = \int_a^b \frac{km}{x^2}\,dx = km\left(-\frac{1}{x}\right)\Big|_a^b = km\left(\frac{1}{a} - \frac{1}{b}\right). \]
Let $b \to \infty$. Then $1/b \to 0$, hence
\[ \int_a^b \frac{km}{x^2}\,dx \to km\left(\frac{1}{a} - 0\right) = \frac{km}{a}. \]
Thus $km/a$ is the work required to move the mass from $a$ to $\infty$. It is convenient
to set
\[ \int_a^\infty \frac{km}{x^2}\,dx = \frac{km}{a}. \]
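The limiting process in the gravitational example can be watched numerically. The following is only a sketch: the values $k = m = 1$ and $a = 2$ are illustrative assumptions, not taken from the text, and the function name `work` is hypothetical. It estimates the work integral by the midpoint rule and compares it with $km(1/a - 1/b)$ as $b$ grows.

```python
def work(k, m, a, b, steps=100_000):
    """Midpoint-rule estimate of the integral of k*m/x**2 from a to b."""
    h = (b - a) / steps
    return sum(k * m / (a + (i + 0.5) * h) ** 2 for i in range(steps)) * h

k, m, a = 1.0, 1.0, 2.0  # illustrative values (assumed, not from the text)
for b in (10.0, 100.0, 1000.0):
    exact = k * m * (1 / a - 1 / b)  # closed form derived in the text
    print(b, work(k, m, a, b), exact)
print("limit km/a =", k * m / a)
```

As $b$ increases the printed values settle toward $km/a$, which is what the convention $\int_a^\infty km/x^2\,dx = km/a$ records.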
A definite integral whose upper limit is $\infty$, whose lower limit is $-\infty$, or
both, is called an improper integral.
In order to give a precise definition of infinite integrals, we must recall what
is meant by an expression such as $\lim_{x\to\infty} F(x) = L$. The definition is
patterned after the definition of the limit of a sequence.
Let $F$ be defined for $x \ge a$, where $a$ is some real number. Then
\[ \lim_{x\to\infty} F(x) = L \]
if for each $\varepsilon > 0$, there is a number $b$ such that
\[ |F(x) - L| < \varepsilon \quad \text{for all } x \ge b. \]
For increasing and decreasing functions the basic fact is analogous to the
one for sequences. (We say $F$ is increasing provided $F(x_1) \le F(x_2)$ whenever
$x_1 < x_2$.)
Let $F$ be an increasing function. Then $\lim_{x\to\infty} F(x)$ exists if and only if
$F(x)$ is bounded above, i.e., if and only if there exists a number $M$ such
that
\[ F(x) \le M \]
for all $x \ge a$, that is, on the domain of $F$.
Similarly if $F(x)$ is a decreasing function, then $\lim_{x\to\infty} F(x)$ exists if and
only if $F(x)$ is bounded below.
We shall not give the proof; with some modification it is the same proof
as for sequences.
Another important fact, also similar in spirit to one for sequences, is the
Cauchy criterion:
Cauchy Criterion: $\lim_{x\to\infty} F(x)$ exists if and only if for each $\varepsilon > 0$, there
exists $b$ such that
\[ |F(x) - F(z)| < \varepsilon \]
whenever $x \ge b$ and $z \ge b$.
There is a similar discussion for limits of the form $\lim_{x\to-\infty} F(x)$ which is
hardly worth writing down. Obviously, if we set $G(x) = F(-x)$, then
$\lim_{x\to-\infty} F(x)$ is equal to $\lim_{x\to\infty} G(x)$, so limits at $-\infty$ involve nothing new.
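Definition of Improper Integrals
Suppose $f(x)$ is defined for $x \ge a$ and is integrable on each interval
$a \le x \le b$ for $b > a$. Set
\[ F(b) = \int_a^b f(x)\,dx. \]
Now $\lim_{b\to\infty} F(b)$ may or may not exist.
Define
\[ \int_a^\infty f(x)\,dx = \lim_{b\to\infty} \int_a^b f(x)\,dx, \]
provided the limit exists. If it does, the integral is said to converge,
otherwise to diverge.
Similarly, define
\[ \int_{-\infty}^b f(x)\,dx = \lim_{a\to-\infty} \int_a^b f(x)\,dx, \]
provided the limit exists. Finally, define
\[ \int_{-\infty}^\infty f(x)\,dx = \int_{-\infty}^0 f(x)\,dx + \int_0^\infty f(x)\,dx, \]
provided both integrals on the right converge.
Remark: An integral from $-\infty$ to $\infty$ may be split at any convenient
finite point just as well as at 0.
An improper integral need not converge. As an example, take
\[ \int_1^\infty \frac{dx}{x}. \]
Since
\[ \int_1^b \frac{dx}{x} = \ln b \]
and $\ln b \to \infty$ as $b \to \infty$, the limit
\[ \lim_{b\to\infty} \int_1^b \frac{dx}{x} \]
does not exist; the integral diverges.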
Remember that a definite integral of a positive function represents the
area under a curve. We interpret the improper integral
\[ \int_a^\infty f(x)\,dx, \qquad f(x) \ge 0, \]
as the area of the infinite region in Fig. 5.1. If the integral converges, the
area is finite; if the integral diverges, the area is infinite.
Fig. 5.1 area of infinite region
At first it may seem unbelievable that a region of infinite extent can
have finite area. But it can, and here is an example. Take the region under the
curve $y = 2^{-x}$ to the right of the $y$-axis (Fig. 5.2). The rectangles shown
in Fig. 5.2 have base 1 and heights $1, \tfrac12, \tfrac14, \tfrac18, \ldots$. Their total area is
\[ 1 + \frac12 + \frac14 + \frac18 + \cdots = 2. \]
Therefore, the shaded infinite region has finite area less than 2.
EXAMPLE 5.1
Compute the exact area of the shaded region in Fig. 5.2.
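Fig. 5.2
Solution: The area is given by the improper integral
\[ \int_0^\infty 2^{-x}\,dx = \lim_{b\to\infty} \int_0^b 2^{-x}\,dx. \]
An antiderivative of $2^{-x}$ is $-2^{-x}/\ln 2$. (Use $2^{-x} = e^{-x \ln 2}$.) Hence
\[ \int_0^b 2^{-x}\,dx = \frac{1 - 2^{-b}}{\ln 2} \to \frac{1}{\ln 2} \quad \text{as } b \to \infty. \]
Answer: $1/\ln 2 \approx 1.4427$.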
Remark: This answer is reasonable. A look at Fig. 5.2 shows that the
area of the shaded region is between 1 and 2. A closer look shows that the
area is slightly less than 1.5. Why?
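The size of the area in Example 5.1 can be checked directly. This is a small sketch, assuming only the antiderivative $-2^{-x}/\ln 2$ quoted in the solution; the helper name `area` is hypothetical. It tabulates the area from 0 to $b$ for growing $b$.

```python
import math

def area(b):
    """Area under y = 2**(-x) from 0 to b, from the antiderivative -2**(-x)/ln 2."""
    return (1 - 2.0 ** (-b)) / math.log(2)

for b in (1, 5, 10, 20, 40):
    print(b, area(b))
print("1/ln 2 =", 1 / math.log(2))
```

The values increase with $b$ and stay between 1 and 1.5, in line with the remark.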
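EXAMPLE 5.2
Evaluate $\int_0^\infty \dfrac{dx}{1 + x^2}$.
Solution:
\[ \int_0^b \frac{dx}{1 + x^2} = \arctan x \,\Big|_0^b = \arctan b. \]
Let $b \to \infty$. Then $\arctan b \to \pi/2$. Hence
\[ \int_0^\infty \frac{dx}{1 + x^2} = \frac{\pi}{2}. \]
Answer: $\pi/2$.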
EXAMPLE 5.3
Evaluate $\int_{-\infty}^3 e^x\,dx$.
Solution:
\[ \int_a^3 e^x\,dx = e^3 - e^a. \]
Let $a \to -\infty$. Then $e^a \to 0$. Hence
\[ \int_{-\infty}^3 e^x\,dx = \lim_{a\to-\infty} (e^3 - e^a) = e^3. \]
Answer: $e^3$.
The following integral arises in various applications such as electrical
circuits, heat conduction, and vibrating membranes:
\[ \int_0^\infty e^{-sx} f(x)\,dx. \]
It is called the Laplace Transform of $f(x)$.
EXAMPLE 5.4
Evaluate $\int_0^\infty e^{-sx} \cos x\,dx$, $s > 0$.
Solution: From integral tables or integration by parts,
\[ \int_0^b e^{-sx} \cos x\,dx = \frac{e^{-sx}}{s^2 + 1}(-s \cos x + \sin x)\,\Big|_0^b
   = \frac{e^{-sb}}{s^2 + 1}(-s \cos b + \sin b) + \frac{s}{s^2 + 1}. \]
Now let $b \to \infty$:
\[ \int_0^\infty e^{-sx} \cos x\,dx = \lim_{b\to\infty} \int_0^b e^{-sx} \cos x\,dx = 0 + \frac{s}{s^2 + 1}. \]
Answer: $\dfrac{s}{s^2 + 1}$.
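The answer to Example 5.4 can be tested against a direct numerical integration. The sketch below is not part of the text's method: it truncates the infinite integral at an assumed upper limit $b = 60$ (the neglected tail is at most $e^{-sb}/s$) and uses a hand-written composite Simpson rule; the names `simpson` and `laplace_cos` are hypothetical.

```python
import math

def simpson(f, a, b, n=20_000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def laplace_cos(s, b=60.0):
    """Integral of exp(-s*x)*cos(x) over [0, b]; b = 60 is an assumed cutoff."""
    return simpson(lambda x: math.exp(-s * x) * math.cos(x), 0.0, b)

for s in (0.5, 1.0, 2.0):
    print(s, laplace_cos(s), s / (s * s + 1))
```

For each $s$ the two printed columns should agree to several decimal places.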
EXAMPLE 5.5
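Evaluate $\int_{-\infty}^\infty \dfrac{dx}{3e^{2x} + e^{-2x}}$.
Solution: By definition, the value of this integral is
\[ \int_0^\infty \frac{dx}{3e^{2x} + e^{-2x}} + \int_{-\infty}^0 \frac{dx}{3e^{2x} + e^{-2x}}, \]
provided both improper integrals converge. From integral tables,
\[ \int_0^b \frac{dx}{3e^{2x} + e^{-2x}} = \frac{1}{2\sqrt{3}} \arctan\left(\sqrt{3}\,e^{2x}\right)\Big|_0^b
   = \frac{1}{2\sqrt{3}} \left[\arctan\left(\sqrt{3}\,e^{2b}\right) - \arctan \sqrt{3}\right]. \]
Now $\arctan(\sqrt{3}\,e^{2b}) \to \pi/2$ as $b \to \infty$. Note that $\arctan \sqrt{3} = \pi/3$.
Hence
\[ \int_0^\infty \frac{dx}{3e^{2x} + e^{-2x}} = \lim_{b\to\infty} \int_0^b \frac{dx}{3e^{2x} + e^{-2x}}
   = \frac{1}{2\sqrt{3}} \left[\frac{\pi}{2} - \frac{\pi}{3}\right]. \]
Similarly,
\[ \int_{-\infty}^0 \frac{dx}{3e^{2x} + e^{-2x}} = \lim_{a\to-\infty} \int_a^0 \frac{dx}{3e^{2x} + e^{-2x}}
   = \frac{1}{2\sqrt{3}} \left[\arctan \sqrt{3} - \arctan 0\right] = \frac{1}{2\sqrt{3}} \left[\frac{\pi}{3} - 0\right]. \]
Thus both improper integrals converge. The answer is the sum of their values.
Answer: $\dfrac{\pi}{4\sqrt{3}}$.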
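Remark: Do you prefer this snappy calculation?
\[ \int_{-\infty}^\infty \frac{dx}{3e^{2x} + e^{-2x}} = \frac{1}{2\sqrt{3}} \arctan\left(\sqrt{3}\,e^{2x}\right)\Big|_{-\infty}^{\infty}
   = \frac{1}{2\sqrt{3}} (\arctan \infty - \arctan 0) = \frac{1}{2\sqrt{3}} \cdot \frac{\pi}{2} = \frac{\pi}{4\sqrt{3}}. \]
Warning. Try the same slick method on
\[ \int_{-\infty}^\infty \frac{dx}{x^2}. \]
It fails. Why?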
Evaluate:
EXERCISES
1.
f 00 dx
/ 2 X3
2. I e~x dx
3.
f
/ xe~x dx
h
4.
[ - 1 dx
J - 0 0
5.
[- 1 dx
a
dx
J- 00 1+ x2
0.
J 4 x\ / x
xer * 2 dx
/
oo f oo
e-1*1dx 8. /
00 J - 00
f 00 dx f 00
9 Ji W T + x 1 10- Jo e~'Xsi nxdx (s >)
/
r r/rr /*00
"4 ^ (letw = :r2) 12. I xe~BXdx (s>0)
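13. $\int_0^\infty x^2 e^{-sx}\,dx$ $(s > 0)$    14. $\int_0^\infty x^n e^{-sx}\,dx$ $(s > 0)$
15. $\int_0^\infty e^{ax} e^{-sx}\,dx$ $(s > a)$    16. $\int_0^\infty x e^{ax} e^{-sx}\,dx$ $(s > a)$
17. $\int_0^\infty e^{-sx} \cosh x\,dx$ $(s > 1)$    18. $\int_0^\infty x e^{-sx} \sin x\,dx$ $(s > 0)$.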
Is the area under the curve finite or infinite?
19. $y = 1/x$; from $x = 5$ to $x = \infty$
20. $y = 1/x^2$; from $x = 1$ to $x = \infty$
21. $y = \sin^2 x$; from $x = 0$ to $x = \infty$
22. $y = (1.001)^{-x}$; from $x = 0$ to $x = \infty$.
Solve for b:
oo f b , [ b dx / dx
23j.e*J, * j, r+?* j. r+^'
25. Find $b$ such that 99% of the area under $y = e^{-x}$ between $x = 0$ and $x = \infty$ is
contained between $x = 0$ and $x = b$.
Denote the Laplace Transform of $f(x)$ by $L(f)(s)$, so
\[ L(f)(s) = \int_0^\infty e^{-sx} f(x)\,dx. \]
26*. Suppose $f(x)$ is continuous for $x \ge 0$, and for some $n$ and some constant $c$ we
have $|f(x)| \le c x^n$ for all $x$ sufficiently large. Prove that $L(f)(s)$ exists for all
$s > 0$.
27*. Suppose $f$ has a continuous derivative $f'$ for $x \ge 0$ and $|f'(x)| \le c x^n$ for $x$
sufficiently large. Prove for $s > 0$ that
\[ L(f')(s) = -f(0) + s L(f)(s). \]
28*. For the $f$ in Ex. 26, set $g(x) = \int_0^x f(t)\,dt$. Prove for $s > 0$ that
\[ L(g)(s) = \frac{1}{s} L(f)(s). \]
6. CONVERGENCE AND DIVERGENCE
Whether an improper integral converges or diverges may be a subtle
matter. The following example illustrates this.
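EXAMPLE 6.1
For which positive numbers $p$ does the integral $\int_1^\infty \dfrac{dx}{x^p}$ converge?
diverge?
Solution: Suppose $p \ne 1$. Then
\[ \int_1^b \frac{dx}{x^p} = -\frac{1}{p - 1} \cdot \frac{1}{x^{p-1}} \Big|_1^b = \frac{1}{p - 1} \left(1 - \frac{1}{b^{p-1}}\right). \]
As $b \to \infty$,
\[ \frac{1}{b^{p-1}} \to 0 \quad \text{if } p - 1 > 0, \]
and
\[ \frac{1}{b^{p-1}} \to \infty \quad \text{if } p - 1 < 0. \]
Hence
\[ \lim_{b\to\infty} \int_1^b \frac{dx}{x^p} \]
exists if $p > 1$, does not exist if $p < 1$. That means the given integral
converges if $p > 1$, diverges if $p < 1$.
If $p = 1$,
\[ \int_1^b \frac{dx}{x} = \ln b \to \infty \quad \text{as } b \to \infty; \]
the integral diverges.
Answer: Converges if $p > 1$, diverges if $p \le 1$.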
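The dependence on $p$ in Example 6.1 is easy to see in numbers. The following sketch simply evaluates the closed form from the solution for a few sample values of $p$ and $b$ (the sample values and the function name `integral_1_to_b` are illustrative choices, not from the text).

```python
import math

def integral_1_to_b(p, b):
    """Integral of x**(-p) from 1 to b (b > 1), from the closed form in the solution."""
    if p == 1:
        return math.log(b)
    return (1 - b ** (1 - p)) / (p - 1)

for p in (0.5, 1.0, 1.5, 2.0):
    row = [integral_1_to_b(p, 10.0 ** k) for k in (1, 3, 6)]
    print(p, row)
```

For $p = 0.5$ and $p = 1$ the rows keep growing as $b$ increases; for $p = 1.5$ and $p = 2$ they level off near $1/(p - 1)$.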
Remark 1: Obviously the same is true of the integral
\[ \int_a^\infty \frac{dx}{x^p} \]
for any positive number $a$.
Remark 2: Now, a subtle question. Why should this integral converge
if $p > 1$ but diverge if $p \le 1$? (See Fig. 6.1.) The curves $y = 1/x^p$ all decrease
as $x$ increases. The key is in their rate of decrease. If $p \le 1$, the curve
decreases slowly enough that the shaded area (Fig. 6.1a) increases without
bound as $b \to \infty$. If $p > 1$, the curve decreases fast enough that the
shaded area (Fig. 6.1b) is bounded by a fixed number, no matter how large $b$ is.
Fig. 6.1a
Convergence Criteria
We know that the integral
\[ \int_a^\infty \frac{dx}{x^2} \qquad (a > 0) \]
converges. Suppose that $0 \le g(x) \le 1/x^2$. Then
\[ \int_a^\infty g(x)\,dx \]
also converges because the area under the curve $y = g(x)$ is even smaller than
the area under $y = 1/x^2$. This illustrates a general principle:
If $f(x) \ge 0$ and if $0 \le g(x) \le f(x)$ for $a \le x < \infty$, then the convergence of
$\int_a^\infty f(x)\,dx$ implies the convergence of $\int_a^\infty g(x)\,dx$.
Proof: We imitate the proof of the comparison test for positive series.
Let
\[ F(b) = \int_a^b f(x)\,dx, \qquad G(b) = \int_a^b g(x)\,dx. \]
Since $f(x) \ge 0$ and $g(x) \ge 0$, both $F(b)$ and $G(b)$ are increasing functions
of $b$. Also $g(x) \le f(x)$, so $G(b) \le F(b)$. By hypothesis
$\lim_{b\to\infty} F(b) = \int_a^\infty f(x)\,dx$ exists, hence $F(b) \le \int_a^\infty f(x)\,dx$. It follows that
\[ G(b) \le F(b) \le \int_a^\infty f(x)\,dx, \]
so $\lim_{b\to\infty} G(b)$ exists, that is, $\int_a^\infty g(x)\,dx$ converges.
EXAMPLE 6.2
Show that the integrals converge:
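(a) $\int_1^\infty \dfrac{dx}{x^2 + \sqrt{x}}$,   (b) $\int_0^\infty \dfrac{e^{-x}}{x + 1}\,dx$,   (c) $\int_1^\infty \dfrac{\sin^2 x}{x^2}\,dx$.
Solution: Note that
\[ \frac{1}{x^2 + \sqrt{x}} < \frac{1}{x^2}, \qquad 1 \le x < \infty, \]
\[ \frac{e^{-x}}{x + 1} \le e^{-x}, \qquad 0 \le x < \infty, \]
\[ \frac{\sin^2 x}{x^2} \le \frac{1}{x^2}, \qquad 1 \le x < \infty. \]
Since the integrals
\[ \int_1^\infty \frac{dx}{x^2}, \qquad \int_0^\infty e^{-x}\,dx, \qquad \int_1^\infty \frac{dx}{x^2} \]
all converge, the given integrals converge by the preceding test.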
A second important convergence criterion is this:
Suppose $f(x) \ge 0$ and $g(x)$ is bounded, i.e., $|g(x)| \le M$ for some constant
$M$. Then the convergence of
$\int_a^\infty f(x)\,dx$ implies the convergence of $\int_a^\infty f(x) g(x)\,dx$.
Proof: Let
\[ F(b) = \int_a^b f(x)\,dx, \qquad H(b) = \int_a^b f(x) g(x)\,dx. \]
We are given that $\lim_{b\to\infty} F(b)$ exists, and we must show that $\lim_{b\to\infty} H(b)$
exists.
By the Cauchy criterion, given $\varepsilon > 0$, there exists $B$ such that
\[ |F(c) - F(b)| < \frac{\varepsilon}{M} \quad \text{whenever } c > b \ge B. \]
It follows that
\[ |H(c) - H(b)| = \left| \int_b^c f(x) g(x)\,dx \right| \le \int_b^c |f(x)|\,|g(x)|\,dx
   \le M \int_b^c f(x)\,dx = M [F(c) - F(b)] < M \cdot \frac{\varepsilon}{M} = \varepsilon \]
whenever $c > b \ge B$. Therefore by the Cauchy criterion, $\lim_{b\to\infty} H(b)$ exists,
that is, $\int_a^\infty f(x) g(x)\,dx$ converges.
EXAMPLE 6.3
Show that the integrals converge:
(a) $\int_0^\infty e^{-x} \sin^3 x\,dx$,   (b) $\int_1^\infty \dfrac{\ln x}{x^3}\,dx$.
Solution: Apply the above criterion.
(a) Since
\[ \int_0^\infty e^{-x}\,dx \]
converges and $|\sin^3 x| \le 1$, the given integral converges.
(b) Write
\[ \frac{\ln x}{x^3} = \frac{1}{x^2} \cdot \frac{\ln x}{x}. \]
The integral
\[ \int_1^\infty \frac{dx}{x^2} \]
converges and $(\ln x)/x$ is bounded. [The maximum of $(\ln x)/x$ is $1/e$.]
Hence the given integral converges.
Remark: Both convergence criteria apply also to improper integrals
of the forms
\[ \int_{-\infty}^b f(x)\,dx \quad \text{and} \quad \int_{-\infty}^\infty f(x)\,dx. \]
A Divergence Criterion
Here is a simple criterion for divergence of an improper integral.
If $f(x) \ge 0$ and if $g(x) \ge f(x)$ for $a \le x < \infty$, then the divergence of
$\int_a^\infty f(x)\,dx$ implies the divergence of $\int_a^\infty g(x)\,dx$.
Proof: This time
\[ G(b) = \int_a^b g(x)\,dx \ge \int_a^b f(x)\,dx = F(b) \]
and $F(b)$ is unbounded, so $G(b)$ is unbounded. Hence $\int_a^\infty g(x)\,dx$ diverges.
This criterion is obvious geometrically; since $g(x) \ge f(x)$, the region under
$y = g(x)$ contains the region under $y = f(x)$. If the second region has infinite
area, so does the first.
EXAMPLE 6.4
Show that the integrals diverge:
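(a) $\int_1^\infty \dfrac{\sqrt{x}}{1 + x}\,dx$,   (b) $\int_2^\infty \dfrac{dx}{\sqrt{x} - \sqrt[3]{x}}$,   (c) $\int_3^\infty \dfrac{\ln x}{x}\,dx$.
Solution: Note that
\[ \frac{\sqrt{x}}{1 + x} \ge \frac{1}{1 + x}, \qquad 1 \le x < \infty, \]
\[ \frac{1}{\sqrt{x} - \sqrt[3]{x}} > \frac{1}{\sqrt{x}}, \qquad 2 \le x < \infty, \]
\[ \frac{\ln x}{x} \ge \frac{\ln 3}{x} > \frac{1}{x}, \qquad 3 \le x < \infty. \]
Since the integrals
\[ \int_1^\infty \frac{dx}{1 + x}, \qquad \int_2^\infty \frac{dx}{\sqrt{x}}, \qquad \int_3^\infty \frac{dx}{x} \]
all diverge, the given integrals diverge by the preceding criterion.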
EXERCISES
Does the integral converge or diverge?
h dx
Jo J + l
+ X2
cosh x dx
/ ' T
7 f sin x
7-00 i + v
, [ '
J i \/^ (* + 4)
/- z3 ,
J ,
f * f jj.
13. Show that / ^--
y 2 (In a;
9.
11
X2+
4. [ e~x*dx
, x dx
6,
10
12
2J:
/
yo V52~+~3
f 00
. / sin a; dx
h dx
J 2 In x
/ " i
)3
Use the substitution w= In x.]
dx
+ x + ex*
converges if p > 1, diverges if p < 1.
14. Show that
f;
converges if p > 1, diverges if p < 1.
x In x[ln(ln x)]p
15. Denote by $R$ the infinite region under $y = 1/x$ to the right of $x = 1$. Suppose $R$
is rotated around the $x$-axis, forming an infinitely long horn. Show that the volume
of this horn is finite. Its surface area, however, is infinite (the surface area is
certainly larger than the area of $R$). Here is an apparent paradox: You can fill the
horn with paint, but you cannot paint it. Where is the fallacy?
Find all values of s for which the integral converges:
r oo r oo gsx
16. / e~sxex dx 17. / .5dx
Jo Jo 1+ *2
f 00 r 00 x8
18. / e~8xe~x2dx 19. / , dx.
Jo J 1 (1 + x*)
7. RELATION TO INFINITE SERIES
We have already seen a number of similarities between infinite series and
infinite integrals. In this section we discuss a very useful connection between
them which often enables us to establish the convergence or divergence of a
series by studying a related integral. This is important, for usually it is easier
to find the value of an integral than the sum of a series.
Consider the relation between the series
\[ \frac{1}{2^2} + \frac{1}{3^2} + \cdots + \frac{1}{n^2} + \cdots \]
and the convergent integral
\[ \int_1^\infty \frac{dx}{x^2}. \]
(See Fig. 7.1.) The rectangles shown in Fig. 7.1 have areas $1/2^2, 1/3^2, \ldots$.
Obviously the sum of these areas is finite, being less than the finite area
under the curve. Hence, the series converges. This illustrates a general
principle:
Fig. 7.1 Note: x scale is 1/4 of y scale.
Suppose $f(x)$ is a positive decreasing function. Then the series
\[ f(1) + f(2) + \cdots + f(n) + \cdots \]
converges if the integral
\[ \int_1^\infty f(x)\,dx \]
converges, and diverges if the integral diverges.
Proof: The argument given above for $f(x) = 1/x^2$ holds for any positive
decreasing function $f(x)$. Figure 7.1 indicates that
\[ f(2) + f(3) + \cdots + f(n) \le \int_1^n f(x)\,dx. \]
If the infinite integral converges, then
\[ s_n = f(1) + f(2) + \cdots + f(n) \le f(1) + \int_1^n f(x)\,dx \le f(1) + \int_1^\infty f(x)\,dx. \]
Hence the increasing partial sums are bounded; the series converges.
If the infinite integral diverges, the rectangles are drawn above the curve
(Fig. 7.2). Their areas are $f(1), f(2), \ldots$. This time
\[ s_n = f(1) + f(2) + \cdots + f(n) \ge \int_1^{n+1} f(x)\,dx. \]
But the integrals on the right are unbounded. Hence the increasing sequence
$\{s_n\}$ is unbounded; the series diverges.
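The two rectangle comparisons in the proof can be checked on the sample function $f(x) = 1/x^2$ (the choice of $f$ follows the text; the helper names `partial_sum` and `integral` are hypothetical). The sketch prints the lower bound $\int_1^{n+1} f$, the partial sum $s_n$, and the upper bound $f(1) + \int_1^n f$.

```python
def partial_sum(n):
    """s_n = 1/1**2 + 1/2**2 + ... + 1/n**2."""
    return sum(1.0 / k**2 for k in range(1, n + 1))

def integral(b):
    """Integral of 1/x**2 from 1 to b."""
    return 1.0 - 1.0 / b

for n in (1, 10, 100, 1000):
    lower = integral(n + 1)    # rectangles drawn above the curve
    upper = 1.0 + integral(n)  # f(1) plus rectangles drawn below the curve
    print(n, lower, partial_sum(n), upper)
```

In every row the partial sum sits between the two bounds, and the upper bound never exceeds 2.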
Fig. 7.2 The rectangular sum exceeds the integral.
EXAMPLE 7.1
For which positive numbers $p$ does the series
\[ \sum_{n=1}^\infty \frac{1}{n^p} \]
converge? diverge?
Solution: The series can be written
\[ f(1) + f(2) + \cdots + f(n) + \cdots, \]
where $f(x) = 1/x^p$, a positive decreasing function. By the preceding principle,
the given series converges or diverges as
\[ \int_1^\infty \frac{dx}{x^p} \]
converges or diverges.
Answer: Converges if $p > 1$, diverges if $p \le 1$.
Convergence of Integrals
Sometimes we can turn the tables and use the convergence of a series to
establish the convergence of an integral. If the integrand changes sign
regularly, we may be able to compare the integral with an alternating series.
EXAMPLE 7.2
Show that $\int_0^\infty \dfrac{\sin x}{x}\,dx$ converges.
Solution: First sketch the graph of $y = (\sin x)/x$. See Fig. 7.3. There
is no trouble at $x = 0$ because $(\sin x)/x \to 1$ as $x \to 0$.
Fig. 7.3 graph of $y = (\sin x)/x$
The figure suggests that the integral is given by an alternating series. To
be precise, let
\[ a_n = (-1)^n \int_{n\pi}^{(n+1)\pi} \frac{\sin x}{x}\,dx. \]
Then $a_n > 0$; in fact $a_n$ is the area of the $n$-th shaded region in Fig. 7.3. Now
\[ a_n = \int_{n\pi}^{(n+1)\pi} \frac{|\sin x|}{x}\,dx < \int_{n\pi}^{(n+1)\pi} \frac{dx}{x} < \frac{\pi}{n\pi} = \frac{1}{n} \qquad (n \ge 1), \]
so $a_n \to 0$. Furthermore
\[ a_n = (-1)^n \int_0^\pi \frac{\sin(x + n\pi)}{x + n\pi}\,dx = \int_0^\pi \frac{\sin x}{x + n\pi}\,dx, \]
so $a_n > a_{n+1}$ because
\[ \frac{\sin x}{x + n\pi} > \frac{\sin x}{x + (n + 1)\pi}, \qquad 0 < x < \pi. \]
Therefore the alternating series
\[ a_0 - a_1 + a_2 - a_3 + \cdots \]
converges. But
\[ a_0 - a_1 + a_2 - \cdots + (-1)^n a_n = \int_0^{(n+1)\pi} \frac{\sin x}{x}\,dx, \]
so we have the existence of
\[ \lim_{n\to\infty} \int_0^{(n+1)\pi} \frac{\sin x}{x}\,dx, \]
where $n$ can take only integer values.
If $b$ is any positive real number, there is an integer $n$ such that $n\pi \le b <
(n + 1)\pi$. Then
\[ \int_0^b \frac{\sin x}{x}\,dx = \int_0^{(n+1)\pi} \frac{\sin x}{x}\,dx - \int_b^{(n+1)\pi} \frac{\sin x}{x}\,dx. \]
If $b \to \infty$, then $n \to \infty$, so the first term on the right converges to a
limit. The second term approaches 0 because
\[ \left| \int_b^{(n+1)\pi} \frac{\sin x}{x}\,dx \right| \le \int_{n\pi}^{(n+1)\pi} \frac{|\sin x|}{x}\,dx = a_n \to 0. \]
Therefore
\[ \int_0^\infty \frac{\sin x}{x}\,dx = \lim_{b\to\infty} \int_0^b \frac{\sin x}{x}\,dx \]
converges.
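The alternating-series argument of Example 7.2 can be illustrated numerically. This is a sketch under stated assumptions: each $a_n$ is computed by a hand-written composite Simpson rule (the names `simpson`, `sinc` and `a` are hypothetical), and the comparison value $\pi/2$ is the known value of this integral, a fact that is not derived in the text.

```python
import math

def simpson(f, lo, hi, n=200):
    """Composite Simpson rule on [lo, hi] with n (even) subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3

def sinc(x):
    """(sin x)/x, with the value 1 at x = 0."""
    return 1.0 if x == 0 else math.sin(x) / x

def a(n):
    """a_n = (-1)**n times the integral of (sin x)/x over [n*pi, (n+1)*pi]."""
    return (-1) ** n * simpson(sinc, n * math.pi, (n + 1) * math.pi)

terms = [a(n) for n in range(1000)]
partial = sum((-1) ** n * t for n, t in enumerate(terms))
print(terms[:4])
print(partial, math.pi / 2)
```

The terms are positive and decreasing, and the alternating partial sum, which equals the integral from 0 to $1000\pi$, lies close to $\pi/2$.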
EXERCISES
Does the series converge or diverge?
L 1 + 7 i + 75 + ^ + -
2- - + ^+T + T +"-
e e2 e3 e4
3 1+ ^+ 3 + ^H- - -
4. , ..7 f +o r sk + 1 ' 1
1+ y/\ 2 -f- \/2 3 -f- \/3 4 -f- \/4
5 - ^ + ^ + 1 '
2 In 2 3 In 3 4 In 4
6- ^ + 1 ' 1 O/l-~O\rt 1
2 (In 2 )p 1 3 (In 3 )p 1 4(ln4)*>
7 . ^ + ^ + 1
1+ l 2 1+ 22 ' 1+ 32 '
L j . 3 5 7
l 2 22 32 42
9. Show geometrically that the sum of $1 + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2} + \cdots$ is less than 2. See
Fig. 7.1. (It is known that the exact sum is $\pi^2/6$, a startling fact.)
10*. Use the method of inscribing and circumscribing rectangles to show that
\[ \ln(n + 1) < 1 + \frac12 + \frac13 + \cdots + \frac1n < 1 + \ln n. \]
Is $1 + \frac12 + \frac13 + \cdots + \frac{1}{1000}$ more or less than 10?
11. Estimate how many terms of the series $1 + \frac12 + \frac13 + \frac14 + \cdots$ must be added
before the sum exceeds 1000.
Does the series converge or diverge?
14
16*. Prove convergent:
n =1 n =2
OO 00
I.
\ f n In r i
n=1 n=2
sin _
7=- ax
V X
17*. (cont.) Prove convergent:
\[ \int_0^\infty \sin x^2\,dx. \]
[Hint: Set $x^2 = u$.]
18*. Let $0 < a < b$. Prove convergent:
\[ \int_0^\infty \frac{\cos ax - \cos bx}{x}\,dx. \]
[Hint: Separate the difficulties at 0 and $\infty$.]
19*. Let $0 < a < b$. Prove convergent:
\[ \int_0^\infty \frac{\arctan bx - \arctan ax}{x}\,dx. \]
8. OTHER IMPROPER INTEGRALS
A definite integral
\[ \int_a^b f(x)\,dx, \qquad a \text{ and } b \text{ finite}, \]
is called improper if $f(x)$ blows up at one or more points in the interval
$a \le x \le b$. Examples are
ri dx f s dx f 5 dx f 1
Jo J 1 *2 J 6
ln(ir 5)
The first integrand blows up at x = 0, the second at x = 2, the third at
x = 6. Such bad points are called singularities of the integrand.
We shall discuss integrals
\[ \int_a^b f(x)\,dx \]
where $f(x)$ has exactly one singularity which occurs either at $x = a$ or at
$x = b$. This is the most common case.
Consider the integral
\[ \int_0^3 \frac{dx}{\sqrt{x}} \]
whose integrand has a singularity at $x = 0$. What meaning can we give to
this integral?
Except at $x = 0$, the integrand is well-behaved. Hence if $h$ is any positive
number, no matter how small, the integral
\[ \int_h^3 \frac{dx}{\sqrt{x}} \]
makes sense. Its value is easily computed:
\[ \int_h^3 \frac{dx}{\sqrt{x}} = 2\sqrt{x}\,\Big|_h^3 = 2(\sqrt{3} - \sqrt{h}). \]
It is reasonable to define
\[ \int_0^3 \frac{dx}{\sqrt{x}} = \lim_{h\to0} \int_h^3 \frac{dx}{\sqrt{x}} = 2\sqrt{3}. \]
Next, consider the integral
\[ \int_0^3 \frac{dx}{x}. \]
We try to sneak up on the integral as before by computing
\[ \int_h^3 \frac{dx}{x} = \ln 3 - \ln h, \]
then letting $h \to 0$. But $\ln h \to -\infty$ as $h \to 0$. Hence
\[ \int_h^3 \frac{dx}{x} \to \infty \quad \text{as } h \to 0. \]
There is no reasonable value for this integral.
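The contrast between the two integrals is visible in a short table. The sketch below only evaluates the two closed forms just obtained, for a few sample values of $h$ (the sample values and the function names `sqrt_case` and `recip_case` are illustrative).

```python
import math

def sqrt_case(h):
    """Integral of 1/sqrt(x) from h to 3."""
    return 2 * (math.sqrt(3) - math.sqrt(h))

def recip_case(h):
    """Integral of 1/x from h to 3."""
    return math.log(3) - math.log(h)

for k in (1, 4, 8, 12):
    h = 10.0 ** (-k)
    print(h, sqrt_case(h), recip_case(h))
print("2*sqrt(3) =", 2 * math.sqrt(3))
```

The first column of values approaches $2\sqrt{3}$, while the second keeps growing as $h$ shrinks.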
Motivated by these examples, we make the following definitions:
Suppose $f(x)$ has one singularity, at $x = a$, and that $a < b$. Define
\[ \int_a^b f(x)\,dx = \lim_{h\to0} \int_{a+h}^b f(x)\,dx, \qquad (h > 0) \]
provided the limit exists. If it does, the improper integral converges,
otherwise, it diverges.
Similarly, if $f(x)$ has one singularity, at $x = b$, define
\[ \int_a^b f(x)\,dx = \lim_{h\to0} \int_a^{b-h} f(x)\,dx, \qquad (h > 0) \]
provided the limit exists.
EXAMPLE 8.1
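For which positive numbers $p$ does the improper integral $\int_0^3 \dfrac{dx}{x^p}$
converge? diverge?
Solution: The case $p = 1$ was just discussed; the integral diverges.
Now assume $p \ne 1$. By definition, the value of the integral is
\[ \lim_{h\to0} \int_h^3 \frac{dx}{x^p}, \]
provided the limit exists. Now
\[ \int_h^3 \frac{dx}{x^p} = -\frac{1}{p - 1} \cdot \frac{1}{x^{p-1}} \Big|_h^3 = \frac{1}{p - 1} \left(\frac{1}{h^{p-1}} - \frac{1}{3^{p-1}}\right). \]
But, as $h \to 0$,
\[ \frac{1}{h^{p-1}} \to 0 \quad \text{if } p - 1 < 0, \]
and
\[ \frac{1}{h^{p-1}} \to \infty \quad \text{if } p - 1 > 0. \]
Hence the limit exists only if $p < 1$. In that case
\[ \int_0^3 \frac{dx}{x^p} = \lim_{h\to0} \int_h^3 \frac{dx}{x^p} = -\frac{1}{(p - 1)\,3^{p-1}}. \]
Answer: Converges if $p < 1$,
diverges if $p \ge 1$.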
Remark 1: The answer applies as well to
\[ \int_0^b \frac{dx}{x^p} \]
for each positive number $b$ since the upper limit plays no essential part in the
discussion. Only the behavior of $1/x^p$ in the immediate neighborhood of $x = 0$
counts.
Remark 2: If $p \ge 1$, the curve $y = 1/x^p$ increases so fast as $x \to 0$
that the area of the shaded region (Fig. 8.1a) tends to infinity. If $p < 1$,
the curve rises so slowly that the area of the shaded region (Fig. 8.1b) is
bounded.
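Caution: Do not confuse these results with those of Example 6.1
concerning
\[ \int_1^\infty \frac{dx}{x^p}. \]
Fig. 8.1 graph of $y = 1/x^p$
In fact,
\[ \int_0^1 \frac{dx}{x^p} \;
   \begin{cases} \text{converges} & \text{if } p < 1, \\ \text{diverges} & \text{if } p \ge 1, \end{cases}
   \qquad
   \int_1^\infty \frac{dx}{x^p} \;
   \begin{cases} \text{diverges} & \text{if } p \le 1, \\ \text{converges} & \text{if } p > 1. \end{cases} \]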
EXAMPLE 8.2
For which positive numbers $p$ does the improper integral
\[ \int_3^4 \frac{dx}{(x - 3)^p} \]
converge? diverge?
Solution: Change variable. Let $u = x - 3$. Then
\[ \int_3^4 \frac{dx}{(x - 3)^p} = \int_0^1 \frac{du}{u^p}. \]
But this integral was discussed above.
Answer: Converges if $p < 1$, diverges if $p \ge 1$.
The techniques of the two preceding examples yield a general fact (for
$a$ and $b$ finite):
The integrals
\[ \int_a^b \frac{dx}{(x - a)^p}, \qquad \int_a^b \frac{dx}{(b - x)^p} \]
converge if $p < 1$, diverge if $p \ge 1$.
The convergence criteria given in Section 6 have analogues for improper
integrals with finite limits. We state (without proof) just one of these.
A Convergence Criterion
Suppose that $f(x) \ge 0$, and that $f(x)$ has a singularity at $x = a$ or at
$x = b$. Suppose $g(x)$ is a bounded function. Then the convergence of
$\int_a^b f(x)\,dx$ implies the convergence of $\int_a^b f(x) g(x)\,dx$.
EXAMPLE 8.3
Show that the integrals converge:
Solution: Use the preceding criterion.
(a) Let $f(x) = 1/\sqrt{x}$ and $g(x) = \cos x$.
(b) Let $f(x) = 1/\sqrt{2 - x}$ and $g(x) = 1/\sqrt{2 + x}$.
EXERCISES
Does the integral converge or diverge?
i 5- J ! ' J ^ x d*
COS X
9. SOME DEFINITE INTEGRALS [optional]
Sometimes it is important to know the exact value of an improper integral,
not just that it converges. Certain (improper) definite integrals can be found
exactly by tricks, even though the corresponding indefinite integrals are
difficult, or even impossible to find. The most common type of trick involves a
change of variable.
EXAMPLE 9.1
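Prove $\int_0^{\pi/2} \ln \sin\theta\,d\theta = -\dfrac{\pi}{2} \ln 2$.
Solution: Since $0 < \sin\theta \le 1$ for $0 < \theta \le \pi/2$, we have
$-\infty < \ln \sin\theta \le 0$. The integrand approaches $-\infty$ as $\theta \to 0+$. To prove
convergence, note that $(2/\pi)\theta \le \sin\theta \le 1$ for $0 \le \theta \le \pi/2$, hence
$\ln\theta + \ln(2/\pi) \le \ln \sin\theta \le 0$. Therefore the integrand does not change sign and the
integral converges by comparison with the convergent integral
\[ \int_0^1 \ln\theta\,d\theta. \]
Now set
\[ I = \int_0^{\pi/2} \ln \sin\theta\,d\theta. \]
Make the change of variable $\theta = \tfrac{\pi}{2} - \alpha$. Then
\[ I = -\int_{\pi/2}^0 \ln \cos\alpha\,d\alpha = \int_0^{\pi/2} \ln \cos\theta\,d\theta. \]
We now have two expressions for $I$; average them:
\[ I = \frac12 \int_0^{\pi/2} (\ln \sin\theta + \ln \cos\theta)\,d\theta
     = \frac12 \int_0^{\pi/2} \ln(\sin\theta \cos\theta)\,d\theta
     = \frac12 \int_0^{\pi/2} \ln\left(\frac{\sin 2\theta}{2}\right) d\theta \]
\[   = \frac12 \int_0^{\pi/2} (\ln \sin 2\theta - \ln 2)\,d\theta
     = \frac12 \int_0^{\pi/2} \ln \sin 2\theta\,d\theta - \frac{\pi \ln 2}{4}. \]
But
\[ \int_0^{\pi/2} \ln \sin 2\theta\,d\theta = \frac12 \int_0^\pi \ln \sin\theta\,d\theta
   = \frac12 \int_0^{\pi/2} \ln \sin\theta\,d\theta + \frac12 \int_{\pi/2}^\pi \ln \sin\theta\,d\theta
   = \frac12 I + \frac12 \int_{\pi/2}^\pi \ln \sin\theta\,d\theta. \]
Therefore
\[ I = \frac14 I + \frac14 \int_{\pi/2}^\pi \ln \sin\theta\,d\theta - \frac{\pi \ln 2}{4}. \]
It is obvious by symmetry (or by the transformation $\theta = \pi - \alpha$) that
\[ \int_{\pi/2}^\pi \ln \sin\theta\,d\theta = \int_0^{\pi/2} \ln \sin\theta\,d\theta = I, \]
hence we have
\[ I = \frac14 I + \frac14 I - \frac{\pi \ln 2}{4}, \]
and the formula follows.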
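The value obtained in Example 9.1 can be compared with a direct numerical estimate. The sketch below uses the midpoint rule, chosen here (as an assumption of this illustration, not a method from the text) because it never evaluates the integrand at the singular endpoint $\theta = 0$; the function name `midpoint` is hypothetical.

```python
import math

def midpoint(f, a, b, n):
    """Midpoint rule with n subintervals; avoids evaluating f at the endpoints."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

estimate = midpoint(lambda t: math.log(math.sin(t)), 0.0, math.pi / 2, 200_000)
exact = -(math.pi / 2) * math.log(2)
print(estimate, exact)
```

The two printed numbers should agree to about four decimal places; the agreement is limited by the slow convergence of the midpoint rule near the logarithmic singularity.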
Remark: It is known that the indefinite integral
\[ \int \ln \sin\theta\,d\theta \]
cannot be expressed in terms of elementary functions (composite functions
built with rational functions, radicals, exponentials, logs, and trigonometric
functions).
In the next two examples we compute by tricks an improper integral whose
corresponding indefinite integral can be worked out, but is complicated.
Compare Exs. 9 and 10.
EXAMPLE 9.2
Prove that
\[ \int_0^\infty \frac{dx}{x^4 + 1} = \int_0^\infty \frac{x^2\,dx}{x^4 + 1}. \]
Solution: Convergence is obvious. The problem suggests a change of
variables. Try $x = 1/u$. Then
\[ \frac{dx}{x^4 + 1} = \frac{-du/u^2}{(1/u^4) + 1} = -\frac{u^2\,du}{u^4 + 1}. \]
Hence, if $0 < a < b$,
\[ \int_a^b \frac{dx}{x^4 + 1} = -\int_{1/a}^{1/b} \frac{u^2\,du}{u^4 + 1} = \int_{1/b}^{1/a} \frac{u^2\,du}{u^4 + 1}. \]
Let $a \to 0$ and then let $b \to \infty$. Then $1/a \to \infty$ and $1/b \to 0$,
so the stated formula follows.
EXAMPLE 9.3
Prove that
\[ \int_0^\infty \frac{dx}{x^4 + 1} = \int_0^\infty \frac{x^2\,dx}{x^4 + 1} = \frac{\pi\sqrt{2}}{4}. \]
Solution: The two integrals are equal by the previous example. This
suggests averaging to increase the symmetry:
\[ \int_0^\infty \frac{dx}{x^4 + 1} = \frac12 \left( \int_0^\infty \frac{dx}{x^4 + 1} + \int_0^\infty \frac{x^2\,dx}{x^4 + 1} \right)
   = \frac12 \int_0^\infty \frac{x^2 + 1}{x^4 + 1}\,dx. \]
Now we need a really clever change of variable. Consider
\[ u = x - \frac{1}{x}. \]
Clearly $u \to -\infty$ as $x \to 0+$ and $u \to \infty$ as $x \to \infty$. Also
\[ \frac{du}{dx} = 1 + \frac{1}{x^2} = \frac{x^2 + 1}{x^2} > 0, \]
so $u = u(x)$ is a strictly increasing function taking the interval $(0, \infty)$ onto
$(-\infty, \infty)$. Furthermore,
\[ u^2 = x^2 - 2 + \frac{1}{x^2}, \qquad u^2 + 2 = x^2 + \frac{1}{x^2} = \frac{x^4 + 1}{x^2}, \]
hence
\[ \frac{du}{u^2 + 2} = \frac{(x^2 + 1)/x^2}{(x^4 + 1)/x^2}\,dx = \frac{x^2 + 1}{x^4 + 1}\,dx; \]
\[ \frac12 \int_0^\infty \frac{x^2 + 1}{x^4 + 1}\,dx = \frac12 \int_{-\infty}^\infty \frac{du}{u^2 + 2}
   = \frac{\sqrt{2}}{4} \int_{-\infty}^\infty \frac{dt}{t^2 + 1} = \frac{\sqrt{2}}{4} \arctan t\,\Big|_{-\infty}^{\infty} = \frac{\pi\sqrt{2}}{4}. \]
EXERCISES
Prove:
1. $\int_0^\pi x \ln \sin x\,dx = -\dfrac{\pi^2}{2} \ln 2$ [Hint: $\int_0^\pi = \int_0^{\pi/2} + \int_{\pi/2}^\pi = \int_0^{\pi/2} + \int_0^{\pi/2} (?)$.]
2. $\int_0^{\pi/2} \ln \tan x\,dx = 0$
3. $\int_0^{\pi/2} \sin x\,(\ln \sin x)\,dx = \ln 2 - 1$ [Hint: Integrate by parts.]
4. $\int_0^\infty \dfrac{dx}{(1 + x)\sqrt{x}} = \pi$
5. $\int_0^1 (\ln x)^n\,dx = (-1)^n n!$ for $n = 1, 2, 3, \ldots$
6. $\int_0^1 x \ln(1 - x)\,dx = -\dfrac34$
7. $\int_0^\infty \dfrac{dx}{\cosh x} = \dfrac{\pi}{2}$
8. $\int_0^\infty \dfrac{dx}{x^3 + 1} = \int_0^\infty \dfrac{x\,dx}{x^3 + 1} = \dfrac{2\pi\sqrt{3}}{9}$
9. Start with $x^4 + 1 = (x^2 + \sqrt{2}\,x + 1)(x^2 - \sqrt{2}\,x + 1)$ and obtain the partial
fraction decomposition
\[ \frac{1}{x^4 + 1} = \frac{\tfrac14\sqrt{2}\,x + \tfrac12}{x^2 + \sqrt{2}\,x + 1} - \frac{\tfrac14\sqrt{2}\,x - \tfrac12}{x^2 - \sqrt{2}\,x + 1}. \]
10*. (cont.) Use this to obtain
\[ \int_{-\infty}^\infty \frac{dx}{x^4 + 1} = \lim_{b\to\infty} \int_{-b}^b \frac{dx}{x^4 + 1} = \frac{\pi\sqrt{2}}{2}. \]
(Compare Example 9.3.)
10. STIRLING'S FORMULA [optional]
In this section we obtain several useful and important results in analysis by
exploiting the method of approximating integrals by sums.
Euler's Constant
We know that the harmonic series $\sum 1/n$ diverges. Now we show much
more: that the partial sum $s_n$ is approximately $\ln n$.
Consider Fig. 10.1a. Comparing the area under the curve $y = 1/x$ between
1 and $n$ with a sum of rectangles, we see that
\[ 1 + \frac12 + \cdots + \frac{1}{n - 1} = \int_1^n \frac{dx}{x} + c_n = \ln n + c_n, \]
where $c_n$ is the area of the shaded regions.
Fig. 10.1 approximation of $s_n$ by an integral. (b) A close-up of the error; it fits into the square of side one.
Now shift the shaded areas to the left as in Fig. 10.1b. They fit inside a
square of side one! It follows that $\{c_n\}$ is an increasing sequence bounded above by 1.
Hence $\lim_{n\to\infty} c_n = \gamma$ exists, and we have
\[ 1 + \frac12 + \cdots + \frac1n - \ln n = \frac1n + c_n \to \gamma. \]
There is a positive constant $\gamma$ such that
\[ \lim_{n\to\infty} \left( \sum_{k=1}^n \frac1k - \ln n \right) = \gamma. \]
The number $\gamma$ is called Euler's constant. Its value is approximately
0.57721566. In view of Fig. 10.1b, this value seems reasonable. For the curve
$y = 1/x$ is convex, so the combined shaded areas fill out a bit more than half
of the square.
Remark: In many computations, the exact value of $\gamma$ is not needed.
What counts is that the difference between $\sum_{k=1}^n 1/k$ and $\ln n$ is bounded,
so for large values of $n$, the approximation $\sum_{k=1}^n 1/k \approx \ln n$ is often accurate
enough.
Wallis's Product
This is an old and remarkable formula:
Wallis's Product
\[ \lim_{n\to\infty} \frac{1}{2n + 1} \left[ \frac{2 \cdot 4 \cdot 6 \cdots (2n)}{1 \cdot 3 \cdot 5 \cdots (2n - 1)} \right]^2 = \frac{\pi}{2}. \]
Its derivation is an exercise in integration. Set
\[ J_n = \int_0^{\pi/2} \cos^n\theta\,d\theta = \int_0^{\pi/2} \cos^{n-1}\theta \cos\theta\,d\theta. \]
Integrate by parts with $u = \cos^{n-1}\theta$ and $dv = \cos\theta\,d\theta$:
\[ J_n = 0 + \int_0^{\pi/2} (n - 1) \cos^{n-2}\theta \sin^2\theta\,d\theta. \]
But
\[ \cos^{n-2}\theta \sin^2\theta = \cos^{n-2}\theta\,(1 - \cos^2\theta) = \cos^{n-2}\theta - \cos^n\theta, \]
hence
\[ J_n = (n - 1)(J_{n-2} - J_n). \]
Solve for $J_n$:
\[ J_n = \frac{n - 1}{n} J_{n-2}. \]
Now apply this reduction formula repeatedly and eventually reach either
\[ J_0 = \int_0^{\pi/2} d\theta = \frac{\pi}{2} \quad \text{or} \quad J_1 = \int_0^{\pi/2} \cos\theta\,d\theta = 1. \]
Clearly there are two cases, $n$ even and $n$ odd. The results are
\[ J_{2n} = \frac{1 \cdot 3 \cdot 5 \cdots (2n - 1)}{2 \cdot 4 \cdot 6 \cdots (2n)} \cdot \frac{\pi}{2}, \qquad
   J_{2n+1} = \frac{2 \cdot 4 \cdot 6 \cdots (2n)}{1 \cdot 3 \cdot 5 \cdots (2n + 1)}. \]
Since $0 \le \cos\theta \le 1$ for $0 \le \theta \le \tfrac12\pi$, we have
\[ \cos^{2n-1}\theta \ge \cos^{2n}\theta \ge \cos^{2n+1}\theta, \]
hence $J_{2n-1} > J_{2n} > J_{2n+1}$. Substitute, then rearrange:
\[ \frac{2 \cdot 4 \cdots (2n - 2)}{1 \cdot 3 \cdots (2n - 1)} > \frac{1 \cdot 3 \cdots (2n - 1)}{2 \cdot 4 \cdots (2n)} \cdot \frac{\pi}{2} > \frac{2 \cdot 4 \cdots (2n)}{1 \cdot 3 \cdots (2n + 1)}, \]
\[ \frac{1}{2n} \left[ \frac{2 \cdot 4 \cdot 6 \cdots (2n)}{1 \cdot 3 \cdot 5 \cdots (2n - 1)} \right]^2 > \frac{\pi}{2} > \frac{1}{2n + 1} \left[ \frac{2 \cdot 4 \cdot 6 \cdots (2n)}{1 \cdot 3 \cdot 5 \cdots (2n - 1)} \right]^2. \]
For simplicity, introduce the quantities
\[ H_n = \frac{1}{2n + 1} \left[ \frac{2 \cdot 4 \cdot 6 \cdots (2n)}{1 \cdot 3 \cdot 5 \cdots (2n - 1)} \right]^2. \]
Now divide the last inequality by $H_n$:
\[ \frac{2n + 1}{2n} > \frac{\pi}{2H_n} > 1. \]
It follows easily that $\pi/2H_n \to 1$, hence $H_n \to \pi/2$ as $n \to \infty$. Done.
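Both limits of this section so far can be watched converging. The following is a sketch only: `euler_gap` and `wallis` are hypothetical helper names, and the Wallis quantity $H_n$ is built up one factor $2k/(2k - 1)$ at a time so that no large factorials are needed.

```python
import math

def euler_gap(n):
    """Partial harmonic sum 1 + 1/2 + ... + 1/n minus ln n."""
    return sum(1.0 / k for k in range(1, n + 1)) - math.log(n)

def wallis(n):
    """H_n = (1/(2n+1)) * [2*4*...*(2n) / (1*3*...*(2n-1))]**2."""
    ratio = 1.0
    for k in range(1, n + 1):
        ratio *= (2.0 * k) / (2.0 * k - 1.0)
    return ratio * ratio / (2 * n + 1)

for n in (10, 100, 1000, 10000):
    print(n, euler_gap(n), wallis(n))
print("pi/2 =", math.pi / 2)
```

The second column drifts down toward 0.5772..., and the third rises toward $\pi/2$ from below, as the inequality $\pi/2 > H_n$ requires.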
Let us express $H_n$ in terms of factorials. The product of evens is easy:
\[ 2 \cdot 4 \cdot 6 \cdots (2n) = 2^n (1 \cdot 2 \cdot 3 \cdots n) = 2^n (n!). \]
To get the product of odds, throw in the missing evens and compensate by
dividing them right out again:
\[ 1 \cdot 3 \cdot 5 \cdots (2n - 1) = \frac{1 \cdot 2 \cdot 3 \cdot 4 \cdots (2n - 1)(2n)}{2 \cdot 4 \cdot 6 \cdots (2n)} = \frac{(2n)!}{2^n\,n!}. \]
It follows that
\[ H_n = \frac{1}{2n + 1} \left[ \frac{2^{2n} (n!)^2}{(2n)!} \right]^2 \]
and that Wallis's formula can be expressed in the form:
\[ \lim_{n\to\infty} \frac{1}{2n + 1} \left[ \frac{2^{2n} (n!)^2}{(2n)!} \right]^2 = \frac{\pi}{2}. \]
There are a number of interesting applications of this formula. Here is one
concerning probability. Suppose a coin is tossed $2n$ times, where $n$ is large.
Then the number of heads that can be expected is about $n$. Yet it seems
unlikely that exactly $n$ heads will appear. Just what is the probability of this
event? Let $p_n$ be the probability of $n$ heads in $2n$ tosses. Then
\[ p_n = \frac{(2n)!}{2^{2n} (n!)^2}. \]
Why? Because the probability that a given sequence of $2n$ heads and tails
occurs is $2^{-2n}$, and there are $\binom{2n}{n}$ such sequences that contain $n$ heads and $n$
tails. (This is the number of ways to choose $n$ positions for the heads from
among $2n$ possible positions.)
Now, Wallis's formula may be rewritten by taking reciprocals:
\[ \lim_{n\to\infty} (2n + 1) \left[ \frac{(2n)!}{2^{2n} (n!)^2} \right]^2 = \frac{2}{\pi}. \]
The quantity in brackets is $p_n$. Hence, for large values of $n$,
\[ (2n + 1)\,p_n^2 \approx \frac{2}{\pi}, \]
that is,
\[ p_n \approx \sqrt{\frac{2}{(2n + 1)\pi}}. \]
Thus, for example, the probability of exactly 10,000 heads in 20,000 tosses
of a coin is
\[ p_{10000} \approx \sqrt{\frac{2}{20001\,\pi}} \approx 0.00564. \]
This is fairly small, roughly 1 in 180. Yet when you realize that there are
20,001 possibilities (no heads, one head, etc.), it is relatively quite large.
Stirling's Formula
The numbers $n!$ occur frequently in applications of mathematics. They
grow rapidly as $n$ increases, and are tedious to compute. Still in many
problems we need at least an estimate of their size. Such an estimate is provided
by the remarkable formula of Stirling:
Stirling's Formula
\[ n! \approx \sqrt{2\pi n}\; n^n e^{-n}. \]
More precisely,
\[ \lim_{n\to\infty} \frac{n!}{\sqrt{2\pi n}\; n^n e^{-n}} = 1. \]
Note that Stirling's formula does not give a close estimate in the usual
sense, but rather an "order of magnitude" estimate. For $n$ large, the difference
between $n!$ and $\sqrt{2\pi n}\; n^n e^{-n}$ is also large. However, this difference is small
relative to $n!$. In other words, for $n$ large enough, $\sqrt{2\pi n}\; n^n e^{-n}$ approximates
$n!$ to within, say, 1%. But 1% of 100! is a huge number.
Proof: It is easier to work with $\ln n!$ than $n!$ itself. Accordingly, let
\[ S_n = \ln n! = \ln 1 + \ln 2 + \ln 3 + \cdots + \ln n. \]
This is just the kind of sum we expect to be related to an integral. Let us
consider a trapezoidal approximation to $\int_1^n \ln x\,dx$. See Fig. 10.2. Comparing the
area of the trapezoids shown to the area under the curve, we see that
\[ \int_1^n \ln x\,dx \approx \tfrac12 \ln 1 + \ln 2 + \ln 3 + \cdots + \ln(n - 1) + \tfrac12 \ln n = S_n - \tfrac12 \ln n. \]
Set
\[ T_n = \int_1^n \ln x\,dx. \]
Then we have $T_n \approx S_n - \tfrac12 \ln n$, that is,
\[ S_n \approx T_n + \tfrac12 \ln n. \]
We shall show that this approximation is close, more precisely, that the
difference of the two quantities approaches a constant. Therefore, we study the
difference
\[ A_n = T_n + \tfrac12 \ln n - S_n. \]
Our strategy is to prove the existence of $\lim A_n$ by showing that $\{A_n\}$ is
increasing and bounded. This we do in two steps. Then we complete the proof
of Stirling's formula in two further steps.
Step 1: To prove $A_1 < A_2 < A_3 < \cdots$. We note that
\[ A_k - A_{k-1} = (T_k - T_{k-1}) + \left[ \tfrac12 \ln k - \tfrac12 \ln(k - 1) \right] - (S_k - S_{k-1}). \]
But
\[ T_k - T_{k-1} = \int_{k-1}^k \ln x\,dx \quad \text{and} \quad S_k - S_{k-1} = \ln k, \]
so
\[ A_k - A_{k-1} = \int_{k-1}^k \ln x\,dx - \tfrac12 [\ln k + \ln(k - 1)]. \]
From the trapezoidal approximation in Fig. 10.3a, we see that
\[ \int_{k-1}^k \ln x\,dx > \tfrac12 [\ln k + \ln(k - 1)]. \]
Therefore $A_k - A_{k-1} > 0$; the sequence increases.
Fig. 10.3 estimates of $A_n$: (a) trapezoidal approximation, (b) tangential approximation
Step 2: To find an upper bound for $\{A_n\}$. Consider Fig. 10.3b. The
tangent at $(k, \ln k)$ lies above the curve $y = \ln x$, which is concave, hence
\[ \int_{k-1}^k \ln x\,dx < \ln k - \frac{1}{2k} \]
for $k \ge 2$.
By summing, we conclude that f n 1 / 1 1 1\ J In x dx < (In 2 + In 3 + + In n) - + - + + - J , 1 / 1 1 1\ T n < Sn ~ - ( - + - + +- ) . 2\2 3 n f Therefore A n = r , + | l n - 5 . < | [ l n - Q + i + + . But the quantity on the right is bounded, as we have seen in studying Euler's constant. Therefore { An} is bounded above. Step 3: The sequence {^4} is increasing and bounded, hence there is a number C such that A n-------->C, that is, Tn + \ \ n n - Sn--------*C. But so x dx = n In n n + 1, >ec, n In n n + 1 + ^In n In n \ -------->C. Take exponentials: nne~ne y f u n\ that is, n\ (*) ---------7=-------->K, nne~ny / n where K = el~c. Step 4- Complete the proof by showing K = \ Z2r . We exploit Wallis's product, which provides a relation between (2n) \ and (n!)2. From relation (*) above: 10. St i r l i ngs Formul a 51 The quotient approaches K*/ K = K. Divide and simplify: K. [ 2 *-(!)* ] ^L ( 2n)!v^J But the quantity in parentheses has limit as is seen from Wallis's product. Therefore K = \ / 2 t ; the proof of Stirling's formula is complete. EXERCISES 1. Estimate the number of digits in 100!. 2. Estimate the number of bridge hands (13 cards) that can be formed from a 52-card deck. Find the limit: n + 1 n + 2 1 | 1 i +l ' n 2 f - ^ + 6. lim- (n!)1/n. n-+ oo71 + + 2 ' ' 3n2 s ) i) V.) 7*. Use the method of the text to prove the existence of 2. Taylor Approximations 1. INTRODUCTION I n this chapter we shall study approximations of functions by polynomials. Why approximate functions by polynomials? Because values of polynomials can be computed by addition and multiplication, simple operations well-suited for hand or machine computations. Suppose, for example, you need a 6-place table of f ( x ) = e3x2 at 1000 equally spaced values of x between 1 and 1. If possible, find a polynomial p( x) such that 63x2 = e(X), where |e(#)| is less than 5 X 107 for 1 < x < 1. Then program a computer to tabulate the corresponding values of p( x) . 
We shall discuss methods for finding polynomial approximations and ob tain estimates for the errors in such approximations. 2. POLYNOMIALS We begin with a basic algebraic property of polynomials: every poly nomial can be expressed not only in powers of x, but also in powers of (x a), where a is any number. This form of the polynomial is convenient for computa tions near x = a. EXAMPLE 2.1 Express x2 + x + 2 in powers of x 1. Solution: Set u = x 1. Then x = u + 1, and x2 + x + 2 = (w + 1 )2 + (u + 1) + 2 = (u2 + 2u + 1) + (u + 1) + 2 = u2 + 3u + 4. Answer: (x I )2 + 3(x 1) + 4. 2. Pol ynomi al s 53 EXAMPLE 2.2 Express %z x2 + l l x %in powers of x 2. Solution: Set u = x 2. Then x = u + 2, and #3 6#2 + 11#6 = (u + 2 ) 3 6(u + 2 ) 2 + 11 (u + 2 ) 6 = us u. Answer: (x 2 ) 3 (x 2 ). Re ma r k : The answer reveals a symmetry about the point (2, 0) not evident in the original expression for the polynomial. EXAMPLE 2.3 Express x4 in powers of x + 1. Solution: x4 = [(# + 1) l ]4. Answer: (x + l )4 4(.t + l ) 3 + 6( x + l )2 - 40 + 1) + 1. The methods used in these examples is simple. To express p( x) = A0 + Ai x + A 2x2 + Azx* + + A nxn in powers of x a, write u = x a. Then substitute u + a for x: p( x) = Ao + A\ ( u + a) + A^i u + a) 2 + + A n(u + a)n. Expand each of the powers by the Binomial Formula and collect like powers of u. The result is a polynomial in u = x a, as desired. This method is laborious when the degree of p( x) exceeds three or four. We now discuss a simpler, more systematic method. Suppose p( x) is a polynomial expressed in powers of x a: p( x) = A0 h A\ { x a) + A%(x a) 2 + + A n(x a ) n. What are the coefficients A0, Ai, A 2, ? There is an easy way to compute A0. J ust replace %by a. Then all terms on the right vanish except the first: p( a) = Ao. Now modify this trick to compute A\. Differentiate p( x) : p' { x) = Ai + 2A2(x a) + + nAn{x a)n_1. Substitute x = a; again all terms vanish except the first: p' ( a) = Ai. 
Differentiate again to find A 2: p" (x) = 2A 2 + 3-2A 3(x a) + + n( n 1) An(x a ) n~2. Substitute x a : p" (a) = 2A2. Once again: p/ f, (x ) = 3-2A z + + n( n - 1) (n - 2) An(x - a)n~s, p' " ( a) = 3-2A z = 3! A z. Continuing in this way yields p(4)(a) =4\ A 4, p (5)(a) = 5!A5, ,p(n)(a) =n! A n. (Here p {k) is the fc-th derivative.) I f p ( x) is a polynomial of degree n and if a is a number, then p( x) = p (a) + p' ( a) ( x - a) + ^ p"( a) ( x - a) 2 + V'" ( a) (x - a)* + + p(n) (a) (x - a ) n. 3! n! EXAMPLE 2.4 Express p( x) = xs x2 + 1 (a) in powers of x ~, (b) in powers of x 10. Jttd Solution: Use the preceding formula with n = 3. Compute three derivatives: p' (x) = Sx2 2x, p" (x) = 6^2, p ,, r(x) = 6. For (a), evaluate at x = J : p(l )=-I- By the formula, +(*-I) For (b), evaluate at x = 10: p(10) = 901, p'(10) = 280, p"( 10) = 58, p'"(10) = 6. By the formula, 54 2. TAYLOR APPROXIMATIONS 2. Pol ynomi al s 55 Answer: (b) 901 + 280(x - 10) + 2 9 (x - 10)2 + (x - 10)3. The next example illustrates the computational advantages gained by ex panding polynomials in powers of x a. EXAMPLE 2.5 Let p( x) ~ xz x2 + 1. Compute p (0.50028) to 5 places. Solution: Use answer (a) of the preceding example: p (0.50028) = p 0 + 0.00028 7 1 1 = - - - (0.00028) + - (0.00028 )2 + (0.00028 )3. O t: JL The last two terms on the right are smaller than 10~7. Therefore to 5 places, p (0.50028) agrees with 7 1 - - - (0.00028) = 0.87500 - 0.00007. Answer: 0.87493. Because we shall write polynomials frequently, it is convenient to use summation notation: n ^ Ai Xx = Ao -f- A\ x + A2x2 + + A nxn. i =0 The formula for an n-th degree polynomial in powers of x a can be ab breviated: P() = V V ^ (x - a ) \ 1=0 t Here p (i) (a) denotes the i-th derivative of p (x) evaluated at x = a, with the special convention p w (a) = p( a) . (Also recall the convention 0 ! = 1.) Expand in powers of x a: 1. x2 + 5x + 2; a = 1 EXERCISES 2. x8- 3x2 + 4x; a = 2 3. 2x3+5x2+13x+10; a = - 1 4. 
a:4 5x2 + x + 2; o = 2 5. 2x4+5x* +4a:+16; a =- 2 6. 3xs 2x2 2x + 1; a = 1 7. 5a:5 + 4x4 3a:3 2x2 + x + 1; a = 1 8. a:5 + 2a;4 + 3x2 + 4x + 5; a = 2 9. a;4 - 7a:3 + 5x2 + 3x - 6; a = 0. Evaluate to 4 significant digits: 10. x3 3a:2 4- 2x + 1; x = 1.004 11. x6 + a:4 + a;3 + a;2 + a; + 1; x = 1.994 12. 4a:4 - 3a:2 + lOx + 12; x = -0.9890 13. 10a;3 + 12x2 - 6x - 5; x = -3.042. 56. 2. TAYLOR APPROXIMATIONS 3. TAYLOR POLYNOMIALS ( Consider this problem: Given a function f ( x) and a number a, find a polynomial p( x) which approximates f ( x) for values of x near a. One approach that seems reasonable is to construct a polynomial pn{x) of degree n so that P n ( a ) = f ( a ) , pn' ( a) = f ' ( a) , pn" (a) = f " (a), , p(n) (a) = / (B) (a). Thus pn(x) mimics f ( x) and its first n derivatives at x = a. Let us find p n(x) explicitly. We write p n(x) = A o + Ai ( x a) + A 2 (x a ) 2 + + A n(x a ) n and choose the coefficients A k appropriately. But in the last section, we saw that Ak = p n(k) ( a) / k\ . Since we want pn(k) (a) = / (A;)(a), we must choose Ak = /<*>(a)/fc! The n-th degree Taylor polynomial of f ( x) at x = a is P n ( x ) = / (a) +/'(a)(x - a) + ^f "{ a ) ( x - a)2 + + ~;/(n)(a)(* - a)", w! When/(#) is itself a polynomial of degree n, then Vn(x) = /(s). This was shown in the last section; pn(x) is precisely the expression for f ( x ) * in powers of x a. Furthermore, in this case, Pn(x) = Pn+l(x) = pn+2 ( x) = . (Why?) Thus for an n-th degree polynomial f ( x) , the n-th degree and all higher Taylor polynomials equal f ( x) . Here are the first three Taylor polynomials explicitly: pi ( x) = f ( a) + f ( a ) (x - a), P2O) =/() + f ( a ) ( x - a) + ^ f ' ( a ) ( x - a) 2, Ps(x) = / (a) +/ ' ( <*) (x - a) + - a)2+ ^f "( a ) ( x - a) 3. The graph of the linear function y = Pi(x) is the tangent to the graph of V = f ( x) at (a, /(a)). 
The graph of the quadratic function y = p2(x) is a parabola through (a, /(a)), also with tangent y = pi ( x) , and curved in the same direction as y = f ( x) . See Fig. 3.1. 3. Tayl or Pol ynomi al s 57 I n general, each Taylor polynomial is derived from the preceding one by the addition of a single term: Pn+i(x) = pn(x) + / (n+1)( a) ( x - a) n+l. (n + 1)! We anticipate that pn(x) is a good approximation to f ( x) ; the error is f ( x) pn(x). We try to reduce this error by adding an additional term / (n+1)(a)(z a)n+1/(n + 1)! to pn(x), thereby obtaining p n+i (x), an even better approximation (we hope). I n Section 6 we shall justify the fol lowing formula for the error: 58 2. TAYLOR APPROXIMATIONS Tayl or s Formula with Remainder Suppose/^) has derivatives up to and including /<n+1) ( x) near x = a. Write / ( ) = Pn( x) + r n( x) , where pn( x) is the n-th degree Taylor polynomial at x = a and r n ( x) is the remainder (or error). Then Usually the integral expressing r n( x) cannot be computed exactly. Never theless, the integral can be estimated. Here is one important estimate: Estimate of Remai nder Suppose f i x ) = Pn( x) + rn(x), where p n( x) is the n-th Taylor polynomial at x = a. If |/ (w+1)0c)| < M in some interval including x = a, say b < x < c, then This assertion is verified by a direct estimate of the integral: EXAMPLE 3.1 Find the n-th degree Taylor polynomial of f ( x ) = ex at x = 0. Estimate the error. Solution: /( 0 ) = 1, / '( 0 ) = 1, /" (0 ) = 1, Hence n n The (ft + l )-th derivative is / (n+1)(0 = e*. If x > 0, the largest value of f(n+1) for i between 0 and x is ex. By the remainder estimate with M = ex, \rn(x)\ < ^ ^ xn+1 for X > 0. If x < 0, the largest value of / (n+1)(0 between 0 and x is e = 1. By the remainder estimate with M 1, Ix Iw~^ |r"(a:)l - ( +! ) ! 3. Tayl or Pol ynomi al s 59 ^.3 Answer: p (*) = 1 + a; + + + + ,, Z\ o! ft! \rn{%)\ < 77; xn+1 for x > 0, (n + 1)! j XI M #)| < 77 -7 7 . for x < 0 . (n + 1)! 
R e m a r k : At first sight, the answer to Example 3.1 seems circular: the error estimate in approximating ex involves ex itself. However, if the ex in the remainder is replaced by something a little larger, say Sx, we still get a useful estimate of the remainder: 3Z k(s)| < 7 ;77. xn+l for x > 0. (n + 1)! EXAMPLE 3.2 Find the Taylor polynomials for sin x at x = 0. Estimate the remainders. Solution: Compute derivatives: /(#) = sin#, f ' ( x ) = cos#, /"(#) = sin#, /"'(#) = cos#, f w (x) = sin#, , repeating in cycles of four. At #= 0, the values are 0, 1, 0, - 1, 0, 1, 0, - 1, 0, Hence the ft-th degree Taylor polynomial of sin #is /y3 n* 5 /v7 /v9 K . W - . - j i + j j - J j + j i - - , where the last term is dbxn/ n\ if n is odd and =txn~1/ (n 1)! if ft is even. 60 2. TAYLOR APPROXIMATIONS For example, ps (x) = Pi (x) = x - , 5 p5(x) = p(a:) = * - + , /y3 r p o r p l M * ) =M ) + Thus p2mi ( x) = P2m(x) and the sign of the last term is plus if m is odd, minus if m is even: ' X3 X^ P * ^ ( X ) = V * m { x ) = x - - + - ------- + ( - ! ) - * (2w I-)-, m - I I =1 ( - 1) - (2* - 1)! The remainder estimate is easy: |/ (w+1)(jc)| ^ 1 because f (n+1)(x) = zfc sin x or dz cos x. Hence knO)| < In+1 ( n+ 1)! Since p 2 m- i ( %) = P2 m( x ) , it follows that |r2m-l(^)| = k'2m(a0 | < .|2m+l (2m + 1)! Answer: sin#= p2m(%) + r2m(%), P2m (a?) m I ( - D 1 r2ii (2i - 1)! - 1 (2m + 1)1 [Note: pim- i ( x) = pim{x).'} R e m a r k : For the cosine, a similar argument shows n%2 rv* 4 /y6 v2wi cos* = 1 - _ + |r2m(jc)| = |r2m+i(x)| < .12m+2 (2m + 2 )! EXAMPLE 3.3 Find the n-th degree Taylor polynomial of In x at x = 5. Estimate the remainder if 4 < #< 6. 3. Tayl or Pol ynomi al s 61 Solution: Let/(#) = I n#. Then x x2 #3 /((*) = ~ - 4 , , /<)(*) = ~, 1)!. The coefficient of (# 5)* in the Taylor polynomial is I / ( 0 (5) = ( _ i ) i \ 5* i5* Therefore, Pn( x) = In 5 + ^ (x - 5 ) - (x - 5 )2 + (x ~ 5 ) 3 ~ ' * + ( - 1 )"-1 ^ r ( x - 5 ) . 
3 5d n 5n An error estimate in the range 4 < #< 6 is M M kn(a?)| < -7 77; \x - 5 |n+1 < - ( l ) n+1, (n+ 1)! where M is a bound for |/(n+1) (#)|. Now (n+ 1)! n! r n+l n! < 7 for 4 < #< 6. 4*1+1 Take this number as M. The resulting estimate is I n ! 1 M * )l < (n + 1)! 4n+1 (n + 1)4W+1 Answer: For 4 < x < 6, n P n ( x ) = I n 5 + ^ (* ~ 5 ) i> t l 1 1 (n + l )4n+1 * Summary T a y l o r 's F o r m u l a w i t h R e m a i n d e r : f ( x ) = Pn( x) + r n( x) , Pn{ x) = ^ (X ~ a )'> r"(a;) = ~ \ f ~ 0 n/ <n+1> ( 0 t =0 0 62 2. TAYLOR APPROXIMATIONS E s t i m a t e o f t h e R e m a i n d e r : If |/(w+1) (01 < M for all t between a and x, then M*)| < I* - S p e c i a l F u n c t i o n s : n ex = ) + rn(x), \rn(x)\ < - i =0 exx ( + 1)! (n + 1)! if x > 0 if x < 0 . 1 V :------ = > ** + 7 1 x L-! 1 x= 0 .n-fl if X 9 ^ 1 . m YA (l )*-1 sin a; = > - ; a;2*-1 + r2m_i(a;), |r2m_i(a;)| < 1=1 .|2m+l (2m + 1)! cos m L , ()! i =0 x2i + r2m(x), Vim{x)\ < . 12m+2 (2m + 2 )! EXERCISES Find the Taylor polynomials at the given point; estimate the remainder: 1. f i x ) = sin 2x; x = 0 3. f i x ) = xex; x = 0 5. f { x) = x2lnx; x = 1 7. f i x ) = xV"*; x = 0 9. f i x ) = x sin x; x 0 11. f i x ) = sinx + cosx; x =0 13. f i x ) = sinh x + sin x; x = 0 2. f i x ) = sin2x; x = t t / 2 4. f i x ) = xex; x = 1 6. /(x) = x2 In x; x = e 8. f i x ) = x2e~x; x = 1 10. /(#) = xsinx; x = 7r/2 12. f i x ) = coshx; x = 0 14. /(:r) = 1+ ex + e2x; x = 0. 15. Let pOr) = x44x3+ 6x23x + 2. Estimate the error in the range f < x < f if this polynomial is approximated by its linear Taylor polynomial about x = 1. 16. Compare the Taylor polynomials for sin x at x = 7r/2 with those for cos x at x = 0. 17. Let Pni x) be the n-th degree Taylor polynomial of f i x ) at x = a. Verify that Pn (a) = / (i) i a), for i = 0, 1, 2, , n. 4. Appl i cati ons 63 4. APPLICATIONS Taylor's Formula with Remainder provides a practical method for ap proximating functions. 
First, it gives a simple procedure for obtaining a polynomial approximation. Second, it supplies an estimate of the error. EXAMPLE 4.1 Find a polynomial p( x) such that \e* p{ x) \ < 0.001 (a) for all x in the interval ! <#<! , (b) for all x in the interval 2 < x < 2 . Solution: By the answer to Example 3.1, a logical choice for p( x) is one of the Taylor polynomials 7*2 ry>3 fy%n pn(x) = l + x + - + - +. . . + - 2! 3! n\ We want to choose n so that \ex p n{x)| < 10~3. To minimize computation and round-off error, we prefer n as small as possible. (a) If J < x < j , then e1/2 / l \ n+1 M ' ) I S ( +1) ! \ 2. We choose n so that pW (n + that is, +i i < 1 0 0 0 1 1 1 < < (n + l )!2n+1 1000e1/2 1648 A few trials show _1___ J _ J _ _ JL_ 4!24 384 J 5!25 ~ 3840* Hence we take n + 1 = 5, that is, n = 4. The desired polynomial is . . X2 X3 X4 M x ) + - + _ + (b) If 2 < x < 2, then 64 2. TA YLOR APPROXIMA TIONS This time we choose n so that (ft + 1)! that is, (ft + 1)! 7389' A few trials show 210 1024 4 1 ____ _______________ - ___________ _____ 10! 3,628,800 14,175 ~ 3544 211 210 2 2 1 11! 10! *11 ~ 38,981 ~ 19,500 Hence we take ft + 1 = 11, that is, n = 10. The desired polynomial is method is approximation by Taylor polynomials. These should be of low degree (few terms) to limit the number of arithmetic operations necessary and to prevent accumulation of round-off errors. Note that we need tabulate sin # and cos x only for 0 < x < 7r/4. (Why?) Let us concentrate on sin#; a similar discussion applies to cos#. First we consider the third degree Taylor polynomial at #= 0, #2 #3 #10 Tables of Sines and Cosines Suppose we want to construct 5-place tables of sin #and cos #. A logical From Section 3 For 5-place accuracy, the error must be less than 5 X 10~6. Therefore we want 4. Appl i cati ons 65 An easy computation with slide rule (or 4-place log tables) shows this is the case if x < 0.22 rad 12.5. 
Thus up to about 12, the third degree Taylor polynomial yields 5-place accuracy. Next we try Since X1 |sina: - p5(a:)| < , 5-place accuracy is obtained if x7 - < 5 X 10-6, #7 < 0.0252. 7! Using the C.R.C. table of seventh powers, we find this is the case provided x < 0.59 rad t t 34. Next we try The error is /y3 /y5 /y*7 X |sina: - pT(*)| < ^ . For angles up to 45, this error is at most ifi ( i ) ' h (0-7854) < j j j <-79> - The C.R.C. table of ninth powers shows (79)9 is slightly less than 12 X 1016. Hence (0.79)9 is slightly less than 12 X 10-2. The C.R.C. tables show also that 1/9! 0.276 X 10~5. Therefore |error| < (12 X 10~2)(0.28 X 10~5) < 4 X 107. Conclusion: For 0 < x < 7r/4, the Taylor polynomial Vi ( x) yields 6-place accuracy in approximating sin#. Furthermore, since pi ( x) involves only four terms, the probability of large accumulated round-off error is low. Thus pi ( x) provides a practical way of constructing a table of sines. I t is not necessary to limit ourselves to Taylor polynomials at x = 0. For example, if we are interested only in angles close to 45, it is better to use Taylor polynomials at tt/4. They provide greater accuracy for the same amount of computation. 66 2. TAYLOR APPROXIMATIONS The third degree Taylor polynomial of sin x at tt/4 is and I sin x - p3(x)\ < - j ) I f x differs from 45 by at most 0.1 rad 5.7, then the error is bounded by j - (0.1)4 < 4.2 X lO"6. 4! Hence for x between 39.3 and 50.7, ps ( x) yields 5-place accuracy. Between 44 and 46, the error is bounded by (0.0175)4< 4 X lO"9, (1 0.01745 rad) 4! and so ps ( x) yields 8-place accuracy. We see that near tt/4, the Taylor polynomial p s ( x ) about x = 7r/ 4 gives the same accuracy as does p i { x ) about x 0. This is typical of Taylor polynomials: for values near x = a, the accuracy achieved by an n-th degree Taylor polynomial about x = 0 is matched by a lower degree Taylor poly nomial about x = a. 
Generally the lower degree polynomial means less compu tation and less round-off error. What is not typical is that every other coefficient is zero in the Taylor polynomials of sin x and cos x about x = 0. For computation, this is excellent; it means relatively little computation yields extraordinary accuracy. For example, the polynomial p i ( x ) of sin# actually involves only four terms, yet provides an approximation to within |#|9/9!. This is why the Taylor polynomials about points other than x = 0 are rarely used for sin x and cos x. EXERCISES 1. Find a polynomial p( x) such that \e~x2 p(#)| < 0.001 for all x in the interval - 1 < #< 1. 2. What degree Taylor polynomial about x = 0 is needed to approximate/{ x) = cos x for 7r/4 < x < 7r/4 to 5 decimal places? 3. Estimate the error in approximating f ( x) = ln(l + x) for | < x < | by its 10-th degree Taylor polynomial about x = 0. 4. Approximate f ( x) = 1/(1 x) 2 for j < x < J to 3 decimal places by a Taylor polynomial about x = 0. 5. Approximate sin2x by its 4-th degree Taylor polynomial about x = 0. Estimate the error if \x\ < 0.1. [Hint: sin2x = \ (1 cos 2#).] 6. Show that for 100 < x < 101, the approximation y / x t t 10 + (x ~~ 100) correct to within 0.0002. 5. Tayl or Seri es 67 7. Show that |sinx pg(x)| < 5 X 10 6 for 0 < x < t / 2, where y$( x) is the 9-th
degree Taylor polynomial for sin x at x = 0.
8. Find the smallest positive integer n such that for 0 < x <
1 x
(1 + X + X2 + + x n) < 5 X 10"6.
9. Compute $\sin(5\pi/8)$ to 5-place accuracy. [Use Taylor polynomials of $\sin x$, but not at $x = 0$.]
10. If $f'(a) = f''(a) = \cdots = f^{(n-1)}(a) = 0$, but $f^{(n)}(a) \ne 0$, show that a reasonable approximation to $f(x)$ near $a$ is $f(x) \approx f(a) + \dfrac{f^{(n)}(a)}{n!}(x-a)^n$.
11. (cont.) Suppose $n$ is even; show that $f(x)$ has a maximum at $x = a$ if $f^{(n)}(a) < 0$, and a minimum at $x = a$ if $f^{(n)}(a) > 0$. Suppose $n$ is odd; show that $f(x)$ has neither a maximum nor a minimum at $x = a$.
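5. TAYLOR SERIES
Consider once again the Taylor polynomials and remainders of the exponential function at $x = 0$:
$$e^x = \sum_{i=0}^{n}\frac{x^i}{i!} + r_n(x),$$
where
$$|r_n(x)| \le \frac{g(x)\,|x|^{n+1}}{(n+1)!}, \qquad g(x) = 1 \text{ for } x \le 0, \quad g(x) = e^x \text{ for } x > 0.$$
This estimate shows that no matter what $x$ is, the error is very small if $n$ is large enough. In other words, for fixed $x$,
$$\frac{g(x)\,|x|^{n+1}}{(n+1)!} \to 0 \quad\text{as } n \to \infty.$$
Here is the reason. The number $x$ is fixed; set $A = |x|$. Pick $m$ so that $m > 10A$. Then
$$\frac{A}{m+1} < \frac{1}{10}, \quad \frac{A}{m+2} < \frac{1}{10}, \quad \ldots, \quad \frac{A}{m+k} < \frac{1}{10}$$
for any $k$. If $n = m + k$, then
$$\frac{|x|^n}{n!} = \frac{A^n}{n!} = \frac{A^m}{m!}\cdot\frac{A}{m+1}\cdot\frac{A}{m+2}\cdots\frac{A}{m+k} < \frac{A^m}{m!}\cdot\frac{1}{10}\cdots\frac{1}{10} = \frac{A^m}{m!}\cdot\frac{1}{10^k}.$$
Now $k \to \infty$ as $n \to \infty$. Since $g(x)A^m/m!$ is fixed, the right-hand term tends to 0 as $n \to \infty$. This means that $r_n(x) \to 0$ for each $x$, hence the infinite series $\sum_{i=0}^{\infty} x^i/i!$ converges to $e^x$:
$$e^x = \sum_{i=0}^{\infty}\frac{x^i}{i!} \quad\text{for all } x.$$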
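The convergence just described can be watched numerically. The following is a minimal sketch, not part of the original text: the function names `exp_partial_sum` and `remainder_bound` are illustrative choices, and the bound used is the estimate $g(x)|x|^{n+1}/(n+1)!$ with $g(x) = 1$ for $x \le 0$ and $g(x) = e^x$ for $x > 0$.

```python
import math

def exp_partial_sum(x, n):
    """Return sum_{i=0}^{n} x**i / i!, the n-th degree Taylor polynomial of e**x at 0."""
    term, total = 1.0, 1.0
    for i in range(1, n + 1):
        term *= x / i          # builds x**i / i! from the previous term
        total += term
    return total

def remainder_bound(x, n):
    """Return g(x) * |x|**(n+1) / (n+1)!, with g(x) = 1 for x <= 0 and e**x for x > 0."""
    g = math.exp(x) if x > 0 else 1.0
    return g * abs(x) ** (n + 1) / math.factorial(n + 1)

if __name__ == "__main__":
    for x in (-3.0, 1.0, 5.0):
        for n in (5, 10, 20, 40):
            err = abs(math.exp(x) - exp_partial_sum(x, n))
            print(f"x={x:5.1f}  n={n:2d}  error={err:.3e}  bound={remainder_bound(x, n):.3e}")
```

For large $n$ the printed error stops shrinking near the limit of floating-point precision, even though the theoretical bound continues toward zero.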
Similarly
$$\sin x = \sum_{i=1}^{\infty}(-1)^{i-1}\frac{x^{2i-1}}{(2i-1)!}, \qquad \cos x = \sum_{i=0}^{\infty}(-1)^{i}\frac{x^{2i}}{(2i)!} \quad\text{for all } x.$$
Such representations of functions by what look like polynomials of infinite degree are called Taylor series. A familiar one is the infinite geometric series:
$$\frac{1}{1-x} = \sum_{i=0}^{\infty} x^i \quad\text{for } |x| < 1.$$
In general, suppose $f(x)$ is a function that has derivatives of all orders. (We say that $f(x)$ is infinitely differentiable.) Then for each $n$,
$$f(x) = \sum_{i=0}^{n}\frac{f^{(i)}(a)}{i!}(x-a)^i + r_n(x).$$
If we can show that $r_n(x) \to 0$ as $n \to \infty$ for certain values of $x$, then we can write
$$f(x) = \sum_{i=0}^{\infty}\frac{f^{(i)}(a)}{i!}(x-a)^i.$$
This formula is called the expansion of $f(x)$ in a Taylor series at $x = a$.
In practice, it may be tedious to compute successive derivatives of $f(x)$ and difficult to determine whether $r_n(x) \to 0$. In Chapter 3, we shall discuss the theory of representing functions by series and shall present various practical techniques for doing so.
EXERCISES
Find the Taylor series about the given point:
1. f i x) = sin3z; x = 0 2. f(x) = cos^x; x = 0
3. f i x ) = sin2 x ; x = 0
5. f i x ) = e~2x; x = 0
7. / ( * ) = e3*+2; x = -
1
4. f i x ) = cos2(2x 1); x = J
6. f i x ) = cosh x; x = 0
8. f i x ) = sin x + 2 cos x; x = 0
9. / ( *) =
1 - Sx
1
10. f ( x) =
1
2 + x
; x = 0
12. An early model desk computer had an exponential pack, but no trigonometric one.
Its program to compute the sine used the approximation
sin x :
2 (e* - e~x) ~ l x i [ 1 + m ( x i + 7 ^ 0 *8) ]
Express the exact error as a power series.
13*. (cont.) Prove |error| < 108for \x\ < \ i r. You may assume 15! > 1.3 X 1012and
(^7r)15 < 103.
6. DERIVATION OF TAYLOR'S FORMULA
We shall derive Taylor's Formula as stated in Section 3:
$$f(x) = p_n(x) + r_n(x),$$
where
$$p_n(x) = \sum_{i=0}^{n}\frac{f^{(i)}(a)}{i!}(x-a)^i \quad\text{and}\quad r_n(x) = \frac{1}{n!}\int_a^x (x-t)^n f^{(n+1)}(t)\,dt.$$
This is a consequence of the following assertion (actually a special case of Taylor's Formula):
If
$$g(a) = g'(a) = g''(a) = \cdots = g^{(n)}(a) = 0,$$
then
$$g(x) = \frac{1}{n!}\int_a^x (x-t)^n g^{(n+1)}(t)\,dt.$$
If this assertion is correct, how does Taylor's Formula follow? Suppose $f(x)$ is any function and $p_n(x)$ is its $n$-th degree Taylor polynomial at $x = a$. Set
$$g(x) = f(x) - p_n(x).$$
Now $p_n(x)$ agrees with $f(x)$, and its first $n$ derivatives agree with those of $f(x)$, at $x = a$. Therefore
$$g(a) = g'(a) = g''(a) = \cdots = g^{(n)}(a) = 0.$$
Thus $g(x)$ satisfies the conditions of the assertion, so
$$g(x) = \frac{1}{n!}\int_a^x (x-t)^n g^{(n+1)}(t)\,dt.$$
But $g(x) = f(x) - p_n(x)$ and $g^{(n+1)}(t) = f^{(n+1)}(t)$ because the $(n+1)$-th derivative of the polynomial $p_n(x)$ is zero. Therefore,
$$f(x) - p_n(x) = \frac{1}{n!}\int_a^x (x-t)^n f^{(n+1)}(t)\,dt,$$
which is precisely Taylor's Formula.
Let us return to the assertion and verify it for low values of $n$. We integrate by parts, noting that $a$ and $x$ are fixed.
Case $n = 0$:
$$\int_a^x (x-t)^0 g'(t)\,dt = \int_a^x g'(t)\,dt = g(x) - g(a) = g(x).$$
(Do not forget $g(a) = 0$.)
Case $n = 1$: To evaluate
$$\frac{1}{1!}\int_a^x (x-t)\,g''(t)\,dt,$$
set $u(t) = (x-t)$ and $v(t) = g'(t)$. Then
$$\int_a^x (x-t)\,g''(t)\,dt = \int_a^x u\,dv = u(t)v(t)\Big|_a^x - \int_a^x v(t)u'(t)\,dt.$$
Therefore
$$\int_a^x (x-t)\,g''(t)\,dt = (x-t)\,g'(t)\Big|_a^x + \int_a^x g'(t)\,dt = \int_a^x g'(t)\,dt = g(x),$$
by the previous case. (Do not forget $g'(a) = 0$.)
Case $n = 2$: Set $u(t) = \tfrac12(x-t)^2$ and $v(t) = g''(t)$. Then
$$\frac{1}{2!}\int_a^x (x-t)^2 g'''(t)\,dt = \int_a^x u\,dv = u(t)v(t)\Big|_a^x - \int_a^x v(t)u'(t)\,dt = 0 + \int_a^x (x-t)\,g''(t)\,dt = g(x),$$
by the previous case. Note that $v(a) = 0$ by the hypothesis $g''(a) = 0$.
The general case is handled the same way. One integration by parts, with
$$u(t) = \frac{(x-t)^n}{n!} \quad\text{and}\quad v(t) = g^{(n)}(t),$$
reduces
$$\frac{1}{n!}\int_a^x (x-t)^n g^{(n+1)}(t)\,dt \quad\text{to}\quad \frac{1}{(n-1)!}\int_a^x (x-t)^{n-1} g^{(n)}(t)\,dt.$$
The latter is $g(x)$ by the previous case.
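The integral form of the remainder can be checked numerically for a specific function. The sketch below is an illustration rather than part of the derivation: it takes $f(x) = e^x$ and $a = 0$ (so that every derivative is again $e^x$), evaluates the integral with a composite Simpson rule, and compares it with $f(x) - p_n(x)$. The helper names and the choice of 1000 subintervals are assumptions made here.

```python
import math

def simpson(func, a, b, steps=1000):
    """Composite Simpson rule for the integral of func over [a, b]; steps must be even."""
    h = (b - a) / steps
    total = func(a) + func(b)
    for j in range(1, steps):
        total += (4 if j % 2 else 2) * func(a + j * h)
    return total * h / 3

def taylor_poly_exp(x, n, a=0.0):
    """n-th degree Taylor polynomial of exp at a (every derivative of exp at a equals exp(a))."""
    return sum(math.exp(a) * (x - a) ** i / math.factorial(i) for i in range(n + 1))

def integral_remainder_exp(x, n, a=0.0):
    """(1/n!) times the integral from a to x of (x - t)**n * exp(t) dt, computed numerically."""
    return simpson(lambda t: (x - t) ** n * math.exp(t), a, x) / math.factorial(n)

if __name__ == "__main__":
    x = 1.5
    for n in range(5):
        direct = math.exp(x) - taylor_poly_exp(x, n)
        print(n, direct, integral_remainder_exp(x, n))
```

The two printed columns should agree to within the accuracy of the numerical integration.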
3. Power Series
1. INTRODUCTION
In this chapter we study power series
$$a_0 + a_1(x-c) + a_2(x-c)^2 + \cdots + a_n(x-c)^n + \cdots$$
and their applications. In most of our examples $c = 0$, but the discussion applies to $c \ne 0$ as well.
Power series serve two important purposes. First, they express known functions in a form particularly suitable for computation. Second, they define functions which are not simple to specify otherwise. Certainly nobody objects to defining a function by a polynomial,
$$f(x) = a_0 + a_1x + a_2x^2 + \cdots + a_nx^n.$$
Then why not define a function by a power series,
$$f(x) = a_0 + a_1x + a_2x^2 + \cdots,$$
provided, of course, that the series converges? We shall see, in fact, that in many ways power series resemble polynomials.
One useful power series is the geometric series
$$\sum_{n=0}^{\infty} x^n,$$
which converges to $1/(1-x)$ for $|x| < 1$ (but diverges for $|x| > 1$).
Other examples of power series were discussed in the preceding chapter:
$$e^x = \sum_{n=0}^{\infty}\frac{x^n}{n!}, \qquad \sin x = \sum_{n=1}^{\infty}(-1)^{n-1}\frac{x^{2n-1}}{(2n-1)!}, \qquad \cos x = \sum_{n=0}^{\infty}(-1)^n\frac{x^{2n}}{(2n)!}.$$
These series were derived from Taylor's Formula with Remainder. Each converges for all values of $x$, whereas the geometric series converges only for $|x| < 1$.
Convergence and Divergence
For each fixed value of $x$, a power series is an infinite series of numbers. Therefore convergence is defined just as it was in Chapter 1.
A power series $\sum_{k=0}^{\infty} a_k(x-c)^k$ converges at a point $x$ if the sequence of partial sums
$$s_n(x) = \sum_{k=0}^{n} a_k(x-c)^k$$
converges. The power series diverges at $x$ if the sequence $\{s_n(x)\}$ diverges.
You can think of a power series as infinitely many numerical series at once, one for each $x$. The series may converge at some points $x$ and diverge at others. If it converges on the set of points $D$, then the sum of the series will generally vary as $x$ varies in $D$; in other words, the sum is a function $F(x)$ with domain $D$. We say that the power series converges to $F(x)$ on $D$. Here is the precise definition:
The power series $\sum_{k=0}^{\infty} a_k(x-c)^k$ converges to the function $F(x)$ on a set $D$, if given $x$ in $D$ and any $\varepsilon > 0$, there exists a positive integer $N$ such that
$$\left|\sum_{k=0}^{n} a_k(x-c)^k - F(x)\right| < \varepsilon \quad\text{for all } n \ge N.$$
The number $N$ tells how soon the partial sums are within $\varepsilon$ of $F(x)$; thus it measures rapidity of convergence. The series may converge at different rates for different points $x$; so for $\varepsilon$ given, $N$ may depend on $x$.
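For example, take the geometric series $\sum x^n$, which converges to $F(x) = 1/(1-x)$ in the interval $|x| < 1$. Suppose $\varepsilon = 0.01$. We ask how large $n$ must be so that
$$\left|F(x) - \sum_{k=0}^{n} x^k\right| < \varepsilon, \quad\text{that is,}\quad \left|\frac{1}{1-x} - \frac{1-x^{n+1}}{1-x}\right| = \left|\frac{x^{n+1}}{1-x}\right| < 0.01.$$
For $x = \frac15$, we require
$$\left(\frac15\right)^{n+1}\Big/\frac45 < \frac{1}{100}, \qquad \frac{1}{4\cdot 5^n} < \frac{1}{100}, \qquad 5^n > 25.$$
Clearly $n \ge 3$ will do, hence $N = 3$.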
For $x = \frac12$, we require
$$\left(\frac12\right)^{n+1}\Big/\frac12 < \frac{1}{100}, \qquad 2^n > 100.$$
This time we need $n \ge 7$, so $N = 7$.
For $x = \frac{9}{10}$, we require
$$\left(\frac{9}{10}\right)^{n+1}\Big/\frac{1}{10} < \frac{1}{100}, \qquad \left(\frac{10}{9}\right)^{n+1} > 1000.$$
Using logarithms, we find that we need $n \ge 65$; in this case $N = 65$.
These results show that the geometric series converges less and less rapidly ($N$ increases) as $x$ increases towards 1. Nevertheless it does converge for each $x$ in the interval $-1 < x < 1$.
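The dependence of $N$ on $x$ can be computed directly. The following sketch is an illustration added here, with `smallest_n` a hypothetical helper name: it searches for the smallest $n$ with $|x|^{n+1}/(1-x) < \varepsilon$, using exact rational arithmetic so that borderline cases are not blurred by rounding. Any larger $n$ also satisfies the inequality, so the value found is the least admissible $N$.

```python
from fractions import Fraction

def smallest_n(x, eps=Fraction(1, 100)):
    """Smallest n >= 0 with |x|**(n+1) / (1 - x) < eps, assuming |x| < 1."""
    n = 0
    while abs(x) ** (n + 1) / (1 - x) >= eps:
        n += 1
    return n

if __name__ == "__main__":
    for x in (Fraction(1, 5), Fraction(1, 2), Fraction(9, 10), Fraction(99, 100)):
        print(f"x = {x}: smallest n = {smallest_n(x)}")
```

Values of $x$ closer to 1 require many more terms, which is the slowdown in convergence described above.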
Radius of Convergence
The set on which a power series converges turns out to be simple: it is either a single point, the whole line, or an interval. The proof of this statement is based on the following fact.
Lemma If a power series $\sum a_nx^n$ converges at $x = x_1$, where $x_1 \ne 0$, then it converges absolutely in the interval $|x| < |x_1|$.
Proof: If $\sum a_nx_1^n$ converges, its terms approach 0. Hence the terms are bounded, that is, there exists a positive number $M$ such that $|a_nx_1^n| \le M$. Now
$$|a_nx^n| = \left|a_nx_1^n\left(\frac{x}{x_1}\right)^n\right| \le M\left|\frac{x}{x_1}\right|^n.$$
If $|x| < |x_1|$, then $|x/x_1| < 1$ and the series converges absolutely by comparison with a convergent geometric series.
The proof of the next result will require a form of the basic completeness property of the field $\mathbf{R}$ of real numbers: Each set of real numbers that has an upper bound has a unique least upper bound (supremum). This means that if $S$ is a set of real numbers such that $x \le B$ for some $B$ and all $x$ in $S$, then there is a number $L$ such that $x \le L$ for all $x$ in $S$ and no smaller number has this property.
Fig. 1.1 [number line: the series diverges to the left of $c - R$, converges between $c - R$ and $c + R$, and diverges to the right of $c + R$]
Theorem Given a power series $\sum a_n(x-c)^n$, precisely one of the following three cases holds:
(i) The series converges only for $x = c$.
(ii) The series converges for all values of $x$.
(iii) There is a positive number $R$ such that the series converges for each $x$ satisfying $|x - c| < R$ and diverges for each $x$ satisfying $|x - c| > R$.
See Fig. 1.1.
Proof: We may take $c = 0$ to simplify the notation, that is, we simply replace $x - c$ by $x$, a translation along the $x$-axis that takes $c$ to 0. Now let $D$ be the domain of convergence of $\sum a_nx^n$. Certainly 0 is in $D$ because $\sum a_n0^n$ converges no matter what the $a_n$ are. We distinguish two cases.
Case 1: $D$ is unbounded. Then there are numbers $x_1$ with $\sum a_nx_1^n$ convergent and $|x_1|$ arbitrarily large. By the Lemma, $D$ contains the interval $|x| < |x_1|$ in each such case. Since $|x_1|$ can be taken arbitrarily large, this means that $D$ contains all real numbers, Case (ii) in the Theorem.
Case 2: $D$ is bounded. Then the set $S$ of all numbers $|x_1|$ where $\sum a_nx_1^n$ converges is bounded above. Let $R$ be the supremum. Thus, if $\sum a_nx_1^n$ converges, then $|x_1| \le R$, and $R$ is the smallest number with this property.
It follows that $\sum a_nx^n$ converges for some points $x$ with $|x| \le R$, and diverges for all points $x$ with $|x| > R$. It remains to show that the series converges for all $x$ with $|x| < R$. Suppose $\sum a_nx_2^n$ diverges; it is enough to show that $R \le |x_2|$. By the Lemma, the series must diverge for every $x$ with $|x_2| < |x|$, otherwise $\sum a_nx_2^n$ would converge. Therefore $|x_2|$ is an upper bound for $S$, so $R \le |x_2|$ since $R$ is the least upper bound.
If $R > 0$ we have (iii) of the Theorem. Otherwise $R = 0$ and we have (i). The proof is complete.
Case (i) is an extreme case, unimportant and uninteresting. It occurs when the coefficients $a_n$ grow so fast that the power series can converge only if all terms after the first vanish. An example is
$$1 + x + 2^2x^2 + 3^3x^3 + 4^4x^4 + \cdots.$$
For each non-zero $x$, note that $|a_nx^n| = |nx|^n \to \infty$ as $n \to \infty$.
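A small numerical illustration of this example, offered as a sketch with the hypothetical helper `term_magnitude`: even for a small value such as $x = 0.1$, the terms $|nx|^n$ shrink at first but grow without bound once $n$ exceeds $1/|x|$, so the series cannot converge.

```python
def term_magnitude(n, x):
    """|n**n * x**n| = |n*x|**n, the size of the n-th term of 1 + x + 2^2 x^2 + 3^3 x^3 + ..."""
    return abs(n * x) ** n

if __name__ == "__main__":
    x = 0.1
    for n in (1, 2, 5, 10, 20, 40):
        print(f"n = {n:2d}: |n x|^n = {term_magnitude(n, x):.3e}")
```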
Case (ii) is the opposite extreme and occurs when the coefficients $a_0, a_1, a_2, \ldots$ become small very rapidly. The series for $e^x$ is an example. Here the general term is $x^n/n!$, which for each $x$ tends to zero very quickly since the coefficients $a_n = 1/n!$ become small so fast.
Case (iii) lies in between. The coefficients do not increase so rapidly that the series never converges (except for $x = c$), nor do they decrease so rapidly that the series always converges. A typical example is the geometric series $\sum x^n$, where each $a_n = 1$. This series converges for $|x| < 1$ and diverges for $|x| > 1$, hence $R = 1$.
Note that in Case (iii) nothing is said about the endpoints $x = c + R$ and $x = c - R$. The series may or may not converge at either point.
The number $R$ in Case (iii) is called the radius of convergence. By convention, $R = 0$ in Case (i), convergence for $x = c$ only, and $R = \infty$ in Case (ii), convergence for all $x$. (The word "radius" will be clarified in Chapter 16, Section 7.)
The interval $c - R < x < c + R$ in Case (iii) is called the interval of convergence. (Refer to Fig. 1.1.) By convention, the interval of convergence in Case (i) is the single point $c$; in Case (ii) it is the entire $x$-axis.
EXAMPLE 1.1
Find the sum of the power series and its radius of convergence $R$:
(a) $1 + \dfrac{x}{3} + \dfrac{x^2}{3^2} + \cdots + \dfrac{x^n}{3^n} + \cdots$
(b) $1 + 5x + 5^2x^2 + \cdots + 5^nx^n + \cdots$
Solution: Each is a geometric series
$$\sum_{n=0}^{\infty} y^n = \frac{1}{1-y}, \qquad |y| < 1.$$
In (a), $y = x/3$; hence the series converges for $|x/3| < 1$ and diverges for $|x/3| > 1$, that is, converges for $|x| < 3$ and diverges for $|x| > 3$. Thus $R = 3$, and the sum is $1/(1 - x/3) = 3/(3 - x)$. In (b), $y = 5x$, which implies convergence for $|x| < \frac15$ and divergence for $|x| > \frac15$. Thus $R = \frac15$, and the sum is $1/(1 - 5x)$.
Note that the series with smaller coefficients has the larger radius of convergence.
EXAMPLE 1.2
Find the radius of convergence and the sum of the power series
$$1 - x^2 + \frac{x^4}{2!} - \frac{x^6}{3!} + \cdots.$$
Solution: The series has the form
$$1 + y + \frac{y^2}{2!} + \frac{y^3}{3!} + \cdots, \quad\text{where } y = -x^2.$$
Since the series converges to $e^y$ for all values of $y$, we may replace $y$ by $-x^2$ for any value of $x$.
Answer: $R = \infty$; the series converges to $e^{-x^2}$ for all values of $x$.
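The substitution $y = -x^2$ can be checked numerically. The sketch below, with the illustrative helper name `partial_sum`, adds up terms of $\sum_{n \ge 0} (-x^2)^n/n!$ and compares the result with $e^{-x^2}$ at a few sample points; it is a sanity check under floating-point arithmetic, not a proof of convergence.

```python
import math

def partial_sum(x, n_terms):
    """Sum of the first n_terms terms of sum_{n>=0} (-x**2)**n / n!."""
    y = -x * x
    term, total = 1.0, 0.0
    for n in range(n_terms):
        total += term
        term *= y / (n + 1)    # next term: y**(n+1) / (n+1)!
    return total

if __name__ == "__main__":
    for x in (0.5, 1.0, 2.0, 3.0):
        approx = partial_sum(x, 80)
        print(f"x = {x}: partial sum = {approx:.12f}, exp(-x^2) = {math.exp(-x * x):.12f}")
```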
EXERCISES
Find the sum of the series and its radius of convergence:
1. $1 + (x-3) + (x-3)^2 + (x-3)^3 + \cdots + (x-3)^n + \cdots$
, , + (l L ) + (i ?)> + + ... + ( +!) +
3. $1 - x^2 + x^4 - x^6 + \cdots + (-1)^n x^{2n} + \cdots$
G K K ) +~+(i)'
+
5. $1 + \dfrac{(x+1)}{1!} + \dfrac{(x+1)^2}{2!} + \dfrac{(x+1)^3}{3!} + \cdots + \dfrac{(x+1)^n}{n!} + \cdots$
6. $1 - \dfrac{2x}{1!} + \dfrac{(2x)^2}{2!} - \dfrac{(2x)^3}{3!} + \cdots + \dfrac{(-2x)^n}{n!} + \cdots$
7. $1 + (5x)^3 + (5x)^6 + (5x)^9 + \cdots + (5x)^{3n} + \cdots$
8. $2x - \dfrac{8x^3}{3!} + \dfrac{32x^5}{5!} - \dfrac{128x^7}{7!} + \cdots + (-1)^{n-1}\dfrac{(2x)^{2n-1}}{(2n-1)!} + \cdots$
, ( X - I Y , ( X ~ 1)8 (x 1)12 , , (x - 1 )4n
9. 1 -------- + _ ---------------------- ( _l ) n
Find the sum of the series and all values of $x$ for which the series converges:
10. $\dfrac{1}{x} + \dfrac{1}{x^2} + \dfrac{1}{x^3} + \cdots + \dfrac{1}{x^n} + \cdots$
11. $1 + e^x + e^{2x} + e^{3x} + \cdots + e^{nx} + \cdots$
12. $\cos^2 x + \cos^4 x + \cos^6 x + \cdots + \cos^{2n} x + \cdots$
13. $1 - \dfrac{\sin 3x}{1!} + \dfrac{\sin^2 3x}{2!} - \dfrac{\sin^3 3x}{3!} + \cdots + (-1)^n\dfrac{\sin^n 3x}{n!} + \cdots$
14. $1 + \dfrac{\ln x}{1!} + \dfrac{\ln^2 x}{2!} + \dfrac{\ln^3 x}{3!} + \cdots + \dfrac{\ln^n x}{n!} + \cdots$
15. $1 + 2\sqrt{x} + 4x + 8x\sqrt{x} + 16x^2 + \cdots + (2\sqrt{x})^n + \cdots$
16. $\ln x + \ln(x^{1/2}) + \ln(x^{1/4}) + \cdots + \ln(x^{1/2^n}) + \cdots$
17. $1 + (x^2 + a^2) + (x^2 + a^2)^2 + (x^2 + a^2)^3 + \cdots$
~2/3 /v.4/3 /v.6/3 /v.2n/3
19 1 5__ U 5___ ______ |_ . . . _L ( _ 1 ------------ L
2! 4! 6! ^ 1 j (2n)!
2. RATIO TEST
For most power series that arise in practice, the radius of convergence can
be found by the following criterion. Roughly speaking, if the coefficients of a
power series behave very much like those of a geometric series with radius of
convergence R , then the power series also has radius of convergence R. Such a
geometric series is $\sum a_n x^n = \sum (1/R)^n x^n$. Note that for this series $a_n/a_{n+1} = R$.
Ratio Test Suppose the power series
$$a_0 + a_1(x - c) + a_2(x - c)^2 + \cdots + a_n(x - c)^n + \cdots$$
has non-zero coefficients. If
$$\left|\frac{a_n}{a_{n+1}}\right| \longrightarrow R \quad \text{as } n \longrightarrow \infty,$$
where $R$ is $0$, positive, or $\infty$, then $R$ is the radius of convergence.
Proof: For simplicity in notation, let us assume $c = 0$. Suppose
$|x| < R$. Then
$$\frac{|a_{n+1}x^{n+1}|}{|a_n x^n|} = \left|\frac{a_{n+1}}{a_n}\right| |x| \longrightarrow \frac{|x|}{R} < 1,$$
hence the series $\sum a_n x^n$ converges (absolutely) by the ratio test for a series of
constants (Chapter 1, Section 3).
Suppose $0 < R < \infty$ and $|x| > R$. Then
$$\frac{|a_{n+1}x^{n+1}|}{|a_n x^n|} \longrightarrow \frac{|x|}{R} > 1,$$
so for $n$ sufficiently large, $\{|a_n x^n|\}$ is an increasing sequence. Therefore $\sum a_n x^n$
diverges because its terms do not approach 0. A slight modification of this
argument applies if $R = 0$.
Thus the series converges for $|x| < R$ and diverges for $|x| > R$, that is, its
radius of convergence is $R$.
EXAMPLE 2.1
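Find the radius of convergence in each case:
(a) $1 + \dfrac{x}{1} + \dfrac{x^2}{2} + \dfrac{x^3}{3} + \cdots + \dfrac{x^n}{n} + \cdots$
(b) $(x - 5) - 4(x - 5)^2 + 9(x - 5)^3 - \cdots + (-1)^{n-1}n^2(x - 5)^n + \cdots$
(c) $1 + \dfrac{x}{2 + 1} + \dfrac{x^2}{2^2 + 2} + \dfrac{x^3}{2^3 + 3} + \cdots + \dfrac{x^n}{2^n + n} + \cdots$
(d) $1 - x + \dfrac{x^2}{2^2} - \dfrac{x^3}{3^3} + \cdots + (-1)^n\dfrac{x^n}{n^n} + \cdots$
(e) $\dfrac{x^3}{\sqrt{3}} + \dfrac{x^6}{\sqrt{6}} + \dfrac{x^9}{\sqrt{9}} + \cdots + \dfrac{x^{3n}}{\sqrt{3n}} + \cdots$
Solution: In each case apply the Ratio Test.
(a) Here $a_n = 1/n$ and
$$\frac{a_n}{a_{n+1}} = \frac{1}{n} \Big/ \frac{1}{n+1} = \frac{n+1}{n}.$$
Hence $a_n/a_{n+1} \longrightarrow 1$ as $n \longrightarrow \infty$, so $R = 1$.
(b)
$$\left|\frac{a_n}{a_{n+1}}\right| = \frac{n^2}{(n+1)^2} = \left(\frac{n}{n+1}\right)^2 \longrightarrow 1 \quad \text{as } n \longrightarrow \infty,$$
so $R = 1$.
(c)
$$\frac{a_n}{a_{n+1}} = \frac{1}{2^n + n} \Big/ \frac{1}{2^{n+1} + n + 1} = \frac{2^{n+1} + n + 1}{2^n + n} = \frac{2 + (n+1)\cdot 2^{-n}}{1 + n\cdot 2^{-n}} \longrightarrow \frac{2 + 0}{1 + 0} = 2$$
as $n \longrightarrow \infty$. Hence $R = 2$.
(d)
$$\left|\frac{a_n}{a_{n+1}}\right| = \frac{1}{n^n} \Big/ \frac{1}{(n+1)^{n+1}} = \frac{(n+1)^{n+1}}{n^n} = (n+1)\left(\frac{n+1}{n}\right)^n > n + 1.$$
Hence $|a_n/a_{n+1}| \longrightarrow \infty$ as $n \longrightarrow \infty$, so $R = \infty$; the series converges for
all values of $x$.
(e) The Ratio Test does not apply directly because two-thirds of the
coefficients in this power series are zero. Nevertheless, the series may be
written
$$\frac{y}{\sqrt{3}} + \frac{y^2}{\sqrt{6}} + \frac{y^3}{\sqrt{9}} + \cdots + \frac{y^n}{\sqrt{3n}} + \cdots,$$
where $y = x^3$. The Ratio Test does apply to the series in this form:
$$\frac{\sqrt{3(n+1)}}{\sqrt{3n}} = \sqrt{\frac{n+1}{n}} \longrightarrow 1.$$
Hence the $y$-series converges for $|y| < 1$ and diverges for $|y| > 1$. Therefore,
the original series converges for $|x^3| < 1$ and diverges for $|x^3| > 1$, i.e., for
$|x| < 1$ and $|x| > 1$, respectively. Hence $R = 1$.
Answer: (a) 1; (b) 1; (c) 2; (d) $\infty$; (e) 1.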
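The limits in parts (a) through (d) can be sampled numerically. The sketch below evaluates $|a_n/a_{n+1}|$ exactly with rational arithmetic at a few values of $n$; the dictionary `coefficients` and the helper `ratio` are names introduced here for illustration only, and a handful of sampled ratios suggests the limit without proving it.

```python
from fractions import Fraction

# Coefficients a_n from Example 2.1, parts (a)-(d), as exact fractions.
coefficients = {
    "a": lambda n: Fraction(1, n),
    "b": lambda n: Fraction((-1) ** (n - 1) * n * n),
    "c": lambda n: Fraction(1, 2 ** n + n),
    "d": lambda n: Fraction((-1) ** n, n ** n),
}

def ratio(a, n):
    """Return |a_n / a_(n+1)| exactly."""
    return abs(a(n) / a(n + 1))

for label, a in coefficients.items():
    samples = [float(ratio(a, n)) for n in (10, 100, 1000)]
    print(label, samples)
```

The printed ratios for (a) and (b) drift toward 1, those for (c) toward 2, and those for (d) keep growing, in line with the answers above.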
Remark: The ratio test does not apply to all series. An example is
$$1 + 2x + x^2 + 2x^3 + x^4 + 2x^5 + \cdots.$$
For this series
$$\frac{a_0}{a_1} = \frac{1}{2}, \quad \frac{a_1}{a_2} = 2, \quad \frac{a_2}{a_3} = \frac{1}{2}, \quad \frac{a_3}{a_4} = 2, \quad \text{etc.}$$
The ratios are alternately $\frac{1}{2}$ and 2; they do not have a limit and so the ratio
test does not apply. Nevertheless, the given series is a perfectly decent one.
It is in fact the sum of two geometric series,
$$1 + x^2 + x^4 + \cdots = \frac{1}{1 - x^2}$$
and
$$2x + 2x^3 + 2x^5 + \cdots = \frac{2x}{1 - x^2},$$
both of which converge for $|x| < 1$ and diverge for $|x| > 1$. Hence
the given series has radius of convergence $R = 1$.
Find the radius of convergence:
1. $1 + x + 2x^2 + 3x^3 + \cdots + nx^n + \cdots$
2 3 n
/y2 /y3 /y4 rpTl
3. X + X- + ? r + ^ + - - - + * '
3 5 1 7 1 2n - 1
/v2 /v.3 /j*4
A " I *V IV / ^
4- * F + i T ~ 7 ^ + + ( 1) ;
x"
5 9 17 v 7 2" + 1
/V /YZ AU /yiTl
5- l + ^ + ^7 + # ^ + +
1-2 2-4 ' 3- 8 n-2n 1
6 ( #+i ) { (* + i )2 | ( x + i y | (* +n*
1-2-3 2-3-4 3-4-5 1 ' n(n+l )(n + 2)
7. x + \ / 2 x 2 \ / Sx s + - - - + y/ nxn + - - -
8 . i _ 2 + 12_ * + . . . + (-*)" i
2 24 29 2n!
9. (e - 1 )x + (e2- 1 )x2+ (e3- 1 )x3H---------1- (e - 1 )x +
. o . - | + S - S + - + ( - D - ( +I ) -
1* 1 23 33 v ' n3
1 , 3(* - 5) , 32(x 5)2 , , 3" (x 5)" ,
1L 4 -1------4?----^--------43-------r ----- h ----H------
12. $1 + x + 2!\,x^2 + 3!\,x^3 + \cdots + n!\,x^n + \cdots$
/y2 /y3 /v4 /ytW
10 *____I___ ~____L _______ L . #. J ___ _____ L
' 2 + In 2 3 + In 3 4 + In 4 n + l n n
14. $1 + \dfrac{1}{2}x + \dfrac{1\cdot 3}{2\cdot 4}x^2 + \dfrac{1\cdot 3\cdot 5}{2\cdot 4\cdot 6}x^3 + \cdots + \dfrac{1\cdot 3\cdot 5\cdots(2n-1)}{2\cdot 4\cdot 6\cdots(2n)}x^n + \cdots$
15' 30+ (52 62) X + ( s 3 63) *2"* *"(5n+l 6n+1) X"
16. $1 + x^2 + x^{10} + x^{12} + x^{20} + x^{22} + \cdots + x^{10n} + x^{10n+2} + \cdots$
17. $4\cdot 5x^4 + 8\cdot 9x^8 + \cdots + 4n(4n + 1)x^{4n} + \cdots$
18. $1 + (1 + 2)x + (1 + 2 + 4)x^2 + (1 + 2 + 4 + 8)x^3 + \cdots + (1 + 2 + 2^2 + \cdots + 2^n)x^n + \cdots$
EXERCISES
3. EXPANSIONS OF FUNCTIONS
For many applications, it is convenient to expand functions in power
series. We have already done this for $e^x$, $\sin x$, and $\cos x$ by computing the
coefficient of $x^n$ from the formula
$$a_n = \frac{f^{(n)}(0)}{n!}.$$
Generally, however, computation of higher derivatives is extremely laborious.
Try to find the seventh derivative of $\tan x$ or of $1/(1 + x^4)$ and you will
soon agree.
In this section we describe several techniques for deriving power series
without tedious differentiation. They all depend on one basic principle:
Uniqueness of Power Series If $f(x)$ has a power series expansion
$$f(x) = \sum_{n=0}^{\infty} a_n(x - c)^n$$
convergent in some interval $|x - c| < R$, then
$$a_n = \frac{f^{(n)}(c)}{n!}.$$
Thus there is only one possible choice for the coefficients; the series is
unique.
Once you find a power series for $f(x)$ at $c$ by any method, fair or foul, then
you have it! There is no other series for $f(x)$.
Proof: The proof is based on a property of power series that will be
discussed in Section 4: within its interval of convergence, a power series can
be differentiated term-by-term, infinitely often. Having this, the rest is easy. If
$f(x) = \sum a_n(x - c)^n$, we differentiate $n$ times, then set $x = c$:
$$f^{(n)}(c) = n!\,a_n, \qquad a_n = \frac{f^{(n)}(c)}{n!}.$$
This is exactly the same method we used in Chapter 2, Section 2 to derive the
coefficients in the Taylor polynomials of $f(x)$.
The first new technique we consider is addition and subtraction of power
series.
Two power series may be added or subtracted term-by-term within their
common interval of convergence.
Proof: Suppose $f(x) = \sum_0^\infty a_n(x - x_0)^n$ and $g(x) = \sum_0^\infty b_n(x - x_0)^n$.
If $s_n(x)$ and $t_n(x)$ denote the partial sums of these series, then $s_n(x) \longrightarrow f(x)$
and $t_n(x) \longrightarrow g(x)$ for each $x$ in the common interval of convergence. But
by a basic theorem on limits, $s_n(x) \pm t_n(x) \longrightarrow f(x) \pm g(x)$. In other words,
$\sum_0^\infty (a_n \pm b_n)(x - x_0)^n = f(x) \pm g(x)$.
EXAMPLE 3.1
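Express $\cosh x$ in a power series at $x = 0$.
Solution: $\cosh x = \frac{1}{2}(e^x + e^{-x})$. But
$$e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots \quad \text{for all } x,$$
$$e^{-x} = 1 - x + \frac{x^2}{2!} - \frac{x^3}{3!} + \cdots \quad \text{for all } x.$$
Add these series and divide by 2:
$$\frac{1}{2}(e^x + e^{-x}) = 1 + \frac{x^2}{2!} + \frac{x^4}{4!} + \cdots.$$
Answer: $\cosh x = 1 + \dfrac{x^2}{2!} + \dfrac{x^4}{4!} + \cdots$ for all $x$.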
The next technique is formal multiplication of series. To simplify notation
we shall stick to c = 0.
Theorem Let
$$f(x) = \sum_{n=0}^{\infty} a_n x^n \quad \text{and} \quad g(x) = \sum_{n=0}^{\infty} b_n x^n$$
for $|x| < R$. Then the product $f(x)g(x)$ has the power series expansion
$$f(x)g(x) = \sum_{n=0}^{\infty} (a_0b_n + a_1b_{n-1} + \cdots + a_nb_0)x^n,$$
also valid for $|x| < R$.
The theorem means that the power series for $f(x)g(x)$ is obtained by
multiplying each term $a_ix^i$ of $f(x)$ by each term $b_jx^j$ of $g(x)$ and collecting terms,
just as in multiplying polynomials. To remember the rule, start with the lowest
terms and work up:
$$f(x)g(x) = (a_0 + a_1x + a_2x^2 + \cdots)(b_0 + b_1x + b_2x^2 + \cdots)$$
$$= a_0b_0 + (a_0b_1 + a_1b_0)x + (a_0b_2 + a_1b_1 + a_2b_0)x^2 + \cdots.$$
The proof of this theorem for products is difficult and best postponed to an
advanced course.
EXAMPLE 3.2
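Compute the terms up to $x^6$ in the power
series of $x^2e^x\sin 2x$ at $x = 0$.
Solution: Since only terms involving $x^6$ and lower powers are required, it suffices to
compute the product
$$x^2\left(1 + x + \frac{x^2}{2} + \frac{x^3}{6} + \cdots\right)\left(2x - \frac{4}{3}x^3 + \cdots\right)$$
$$= x^2\left[2x + 2x^2 + \left(1 - \frac{4}{3}\right)x^3 + \left(\frac{1}{3} - \frac{4}{3}\right)x^4 + \cdots\right].$$
Answer: For all $x$,
$$x^2e^x\sin 2x = 2x^3 + 2x^4 - \frac{1}{3}x^5 - x^6 + \cdots.$$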
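The collecting of terms in Example 3.2 is the rule $c_n = a_0b_n + a_1b_{n-1} + \cdots + a_nb_0$ applied to truncated coefficient lists. The following sketch carries it out with exact fractions; the function name `cauchy_product` and the truncation length `N` are assumptions made for this illustration.

```python
from fractions import Fraction
from math import factorial

def cauchy_product(a, b):
    """Coefficients of the product of two truncated power series."""
    n = min(len(a), len(b))
    return [sum(a[i] * b[k - i] for i in range(k + 1)) for k in range(n)]

N = 7  # keep coefficients of x^0 .. x^6

# e^x: coefficient of x^n is 1/n!
exp_coeffs = [Fraction(1, factorial(n)) for n in range(N)]

# sin 2x: coefficient of x^n is (-1)^((n-1)/2) * 2^n / n! for odd n, else 0
sin2x_coeffs = [
    Fraction((-1) ** (n // 2) * 2 ** n, factorial(n)) if n % 2 else Fraction(0)
    for n in range(N)
]

product = cauchy_product(exp_coeffs, sin2x_coeffs)

# Multiplying by x^2 shifts every coefficient up by two places.
result = ([Fraction(0), Fraction(0)] + product)[:N]
print(result)
```

The entries of `result` at positions 3 through 6 are the coefficients of $x^3, \ldots, x^6$ and can be compared with the answer above.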
The next technique is substitution; it is used to find the power series for a
composite function from the series for the component functions.
Theorem Let
$$f(z) = a_0 + a_1z + a_2z^2 + \cdots$$
converge in the interval $|z| < R$. Suppose $g(x)$ is a function with domain $D$
and such that $|g(x)| < R$ for all $x$ in $D$. Then
$$(*) \qquad f[g(x)] = a_0 + a_1g(x) + a_2[g(x)]^2 + a_3[g(x)]^3 + \cdots.$$
If in addition $g(x)$ is a power series with $g(0) = 0$, that is,
$$g(x) = b_1x + b_2x^2 + b_3x^3 + \cdots$$
for $|x| < r$, then the series on the right side of (*) can be converted into a
power series by formally squaring, cubing, etc., and collecting terms. The
resulting power series is the valid expansion of $f[g(x)]$ for $|x| < r$.
The proof of this result is definitely beyond the scope of this course.
Remark: Note that $g(x)$ lacks a constant term, that is, $b_0 = 0$.
Without this restriction, there are infinitely many terms contributing to
each coefficient of $f[g(x)]$, a tricky situation to handle.
The theorem often allows us to find series for various functions by simple
modifications of known series. Here are a few everyday examples:
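$$\frac{1}{1 - x^2} = 1 + x^2 + x^4 + x^6 + \cdots \qquad (|x| < 1).$$
$$\frac{1}{1 + 2x^3} = 1 + (-2x^3) + (-2x^3)^2 + (-2x^3)^3 + \cdots = 1 - 2x^3 + 4x^6 - 8x^9 + \cdots \qquad (|x| < 1/\sqrt[3]{2}).$$
$$e^{-x^2/2} = 1 + \left(-\frac{x^2}{2}\right) + \frac{1}{2!}\left(-\frac{x^2}{2}\right)^2 + \frac{1}{3!}\left(-\frac{x^2}{2}\right)^3 + \cdots = 1 - \frac{x^2}{2} + \frac{x^4}{2^2\cdot 2!} - \frac{x^6}{2^3\cdot 3!} + \cdots \qquad (\text{all } x).$$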
Let us now consider a less simple example.
EXAMPLE 3.3
Find all terms up to the term in $x^8$ in the power series at $x = 0$
of the function
$$f(x) = \frac{1}{1 - \sin(x^2)}.$$
Assume $|x| < 1$.
Solution: If $|x| < 1$, then $|\sin(x^2)| < 1$; hence by (*),
$$f(x) = 1 + \sin(x^2) + [\sin(x^2)]^2 + [\sin(x^2)]^3 + \cdots.$$
Now, from the series for $\sin x$, we have
$$\sin(x^2) = x^2 - \frac{x^6}{6} + \frac{x^{10}}{120} - \cdots.$$
We substitute this series into the expression for $f(x)$. We square, cube, etc.,
collecting powers of $x$ up to the eighth power:
$$f(x) = 1 + \left(x^2 - \frac{x^6}{6}\right) + \left(x^4 - \frac{x^8}{3}\right) + x^6 + x^8 + \cdots.$$
Answer: $f(x) = 1 + x^2 + x^4 + \frac{5}{6}x^6 + \frac{2}{3}x^8 + \cdots$
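The substitution in Example 3.3 can be checked mechanically by working with coefficient lists truncated after $x^8$. In the sketch below, `ORDER`, `mul`, and the list `g` (holding the series for $\sin(x^2)$ through $x^8$) are names chosen here for illustration; powers of $g$ beyond the fourth start at $x^{10}$ and so are not needed.

```python
from fractions import Fraction

ORDER = 9  # keep coefficients of x^0 .. x^8

def mul(a, b):
    """Product of two coefficient lists, truncated after x^(ORDER-1)."""
    out = [Fraction(0)] * ORDER
    for i, ai in enumerate(a):
        if ai == 0:
            continue
        for j, bj in enumerate(b):
            if i + j < ORDER:
                out[i + j] += ai * bj
    return out

# g(x) = sin(x^2) = x^2 - x^6/6 + (terms of degree 10 and higher)
g = [Fraction(0)] * ORDER
g[2] = Fraction(1)
g[6] = Fraction(-1, 6)

# f(z) = 1 + z + z^2 + z^3 + z^4 + ... with z = g(x)
total = [Fraction(0)] * ORDER
power = [Fraction(1)] + [Fraction(0)] * (ORDER - 1)  # g^0 = 1
for _ in range(5):                                   # adds g^0 through g^4
    total = [t + p for t, p in zip(total, power)]
    power = mul(power, g)

print(total)
```

The non-zero entries of `total` sit at the even positions and reproduce the coefficients in the answer.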
Odd and Even Functions
The power series at $x = 0$ for the odd function $\sin x$ lacks $x^2, x^4, x^6, \ldots$.
Likewise the series at $x = 0$ for the even function $\cos x$ lacks $x, x^3, x^5, \ldots$.
These examples illustrate a useful principle:
If $f(x)$ is an odd function, $f(-x) = -f(x)$, then its power series at $x = 0$
has the form
$$f(x) = a_1x + a_3x^3 + a_5x^5 + \cdots.$$
If $g(x)$ is an even function, $g(-x) = g(x)$, then its power series at $x = 0$
has the form
$$g(x) = a_0 + a_2x^2 + a_4x^4 + \cdots.$$
These statements are easy to remember: an odd (even) function involves only
odd (even) powers of $x$ at $x = 0$.
The proofs of the principles depend in turn on another pair of important
facts:
The derivative of an odd function is an even function; the derivative of
an even function is an odd function.
Reason: Suppose $f(x)$ is odd, $f(-x) = -f(x)$. Differentiate, using the Chain
Rule on the left-hand side: $-f'(-x) = -f'(x)$. Hence $f'(-x) = f'(x)$, so
$f'(x)$ is even. Likewise, if $g(-x) = g(x)$, then $g'(-x) = -g'(x)$.
Now return to power series of odd and even functions. Suppose $f(x)$ is
odd and at $x = 0$ has the power series
$$f(x) = a_0 + a_1x + a_2x^2 + \cdots.$$
Since $f(x)$ is odd, $f(0) = 0$. Hence $a_0 = 0$. The derivative $f'(x)$ is even and
its derivative $f''(x)$ is again odd. Hence $f''(0) = 0$. But $a_2 = f''(0)/2!$, so
$a_2 = 0$. Likewise $a_4 = 0$, $a_6 = 0, \ldots$, so
$$f(x) = a_1x + a_3x^3 + a_5x^5 + \cdots.$$
The statement about even functions can be proved similarly.

EXAMPLE 3.4
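Find the power series for $\tan x$ at $x = 0$ up to terms in $x^7$.
Solution: The power series for $\tan x$ is obtained by long division: the
series for $\sin x$ is divided by the series for $\cos x$. Here is a systematic way to
carry out the long division. Set
$$\tan x = a_1x + a_3x^3 + a_5x^5 + a_7x^7 + \cdots,$$
valid because $\tan x$ is an odd function. Write the identity $\tan x\cos x = \sin x$
in terms of power series:
$$(a_1x + a_3x^3 + a_5x^5 + a_7x^7 + \cdots)\left(1 - \frac{x^2}{2} + \frac{x^4}{24} - \frac{x^6}{720} + \cdots\right) = x - \frac{x^3}{6} + \frac{x^5}{120} - \frac{x^7}{5040} + \cdots.$$
Multiply the two series on the left, then equate coefficients.
$$x: \quad a_1 = 1;$$
$$x^3: \quad a_3 - \frac{1}{2}a_1 = -\frac{1}{6};$$
$$x^5: \quad a_5 - \frac{1}{2}a_3 + \frac{1}{24}a_1 = \frac{1}{120};$$
$$x^7: \quad a_7 - \frac{1}{2}a_5 + \frac{1}{24}a_3 - \frac{1}{720}a_1 = -\frac{1}{5040}.$$
Solve these equations successively for $a_1, a_3, a_5, a_7$:
$$a_1 = 1, \quad a_3 = \frac{1}{3}, \quad a_5 = \frac{2}{15}, \quad a_7 = \frac{17}{315}.$$
Alternate Solution: Write
$$\tan x = \sin x \cdot \frac{1}{\cos x}.$$
Now
$$\frac{1}{\cos x} = \frac{1}{1 - \left(\dfrac{x^2}{2!} - \dfrac{x^4}{4!} + \dfrac{x^6}{6!} - \cdots\right)} = \frac{1}{1 - z} = 1 + z + z^2 + z^3 + \cdots, \qquad z = \frac{x^2}{2} - \frac{x^4}{24} + \frac{x^6}{720} - \cdots.$$
Compute up to $x^6$ only:
$$z = \frac{x^2}{2} - \frac{x^4}{24} + \frac{x^6}{720} - \cdots, \qquad z^2 = \frac{x^4}{4} - \frac{x^6}{24} + \cdots, \qquad z^3 = \frac{x^6}{8} + \cdots.$$
The quantities $z^4, z^5, \ldots$ can be ignored; they do not contain $x^6$ or lower
powers of $x$. Collect terms:
$$\frac{1}{\cos x} = 1 + \frac{1}{2}x^2 + \frac{5}{24}x^4 + \frac{61}{720}x^6 + \cdots.$$
Therefore
$$\tan x = \sin x \cdot \frac{1}{\cos x} = \left(x - \frac{x^3}{6} + \frac{x^5}{120} - \frac{x^7}{5040} + \cdots\right)\left(1 + \frac{1}{2}x^2 + \frac{5}{24}x^4 + \frac{61}{720}x^6 + \cdots\right)$$
$$= x + \left(\frac{1}{2} - \frac{1}{6}\right)x^3 + \left(\frac{1}{120} - \frac{1}{12} + \frac{5}{24}\right)x^5 + \left(-\frac{1}{5040} + \frac{1}{240} - \frac{5}{144} + \frac{61}{720}\right)x^7 + \cdots.$$
Answer:
$$\tan x = x + \frac{1}{3}x^3 + \frac{2}{15}x^5 + \frac{17}{315}x^7 + \cdots.$$
Remark: The power series for $\tan x$ converges to $\tan x$ for $|x| < \pi/2$.
This is the largest interval about $x = 0$ in which the denominator $\cos x$
is non-zero.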
EXAMPLE 3.5
What is the value of the seventh derivative of $\tan x$ at $x = 0$?
Solution: Denote $\tan x$ by $f(x)$. The coefficient of $x^7$ in its power series
is $f^{(7)}(0)/7!$. But that coefficient is $17/315$ from the answer to the preceding
problem. Hence
$$\frac{f^{(7)}(0)}{7!} = \frac{17}{315}, \qquad f^{(7)}(0) = \frac{(17)(5040)}{315}.$$
Answer: 272.
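The long division of Example 3.4 and the derivative value of Example 3.5 can be reproduced by solving $\tan x \cos x = \sin x$ for one coefficient at a time. The sketch below does this with exact fractions; the list names `sin_c`, `cos_c`, `tan_c` and the truncation length `N` are assumptions for this illustration.

```python
from fractions import Fraction
from math import factorial

N = 8  # keep coefficients of x^0 .. x^7

# sin x has only odd powers, cos x only even powers.
sin_c = [Fraction((-1) ** (n // 2), factorial(n)) if n % 2 else Fraction(0)
         for n in range(N)]
cos_c = [Fraction((-1) ** (n // 2), factorial(n)) if n % 2 == 0 else Fraction(0)
         for n in range(N)]

# Equating coefficients of x^n in tan * cos = sin gives
#   cos_c[0]*tan_c[n] + cos_c[1]*tan_c[n-1] + ... + cos_c[n]*tan_c[0] = sin_c[n],
# which is solved here for tan_c[n].
tan_c = []
for n in range(N):
    known = sum(cos_c[k] * tan_c[n - k] for k in range(1, n + 1))
    tan_c.append((sin_c[n] - known) / cos_c[0])

seventh_derivative = tan_c[7] * factorial(7)
print(tan_c)
print(seventh_derivative)
```

Raising `N` produces further coefficients of the tangent series with no extra work, which is the practical advantage of this method over repeated differentiation.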
EXERCISES
Find the power series at x = 0 :
1. 2. *2
15x 1 -f- x3
3. ex3 4. cos 2x
5. cosh y/ x 6. x (sin x + sin Sx)
1 cos x
7. ( x - l)ex 8.
x2
9 * 1Q s mx - x
1 X X 6
11. sin2x 12. sin (a;2).
[Hint: Use a trigonometric identity.]
Compute the terms up to and including z6 in the power series at x = 0 :
13- ( 1 - 2 , HI - 3,2) 14.e*sm(**)
*
15. - 16 1
1 x2ex 1 + x2 + x4
17. sin3 x 18. In cos x.
Find / (8) (0):
19. f i x) = --i 20. f i x) = zcosx
1 x
21. f i x) = 22. f i x) = arc tan(z3).
23. Show that the series $1 - \dfrac{x}{2!} + \dfrac{x^2}{4!} - \dfrac{x^3}{6!} + \cdots$ converges to $\cos\sqrt{x}$ for $x \ge 0$.
24. (cont.) Show that it converges to $\cosh\sqrt{-x}$ for $x < 0$.
4. FURTHER TECHNIQUES
The methods of the last section enable us to derive new power series from
known ones. Here is another important method for doing so. We postpone the
proof until Section 7.
A power series may be differentiated or integrated term-by-term within its
interval of convergence.
EXAMPLE 4.1
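Find power series at $x = 0$ for $(1 - x)^{-2}$ and $(1 - x)^{-3}$.
Solution: For $|x| < 1$,
$$(1 - x)^{-2} = \frac{d}{dx}(1 - x)^{-1} = \frac{d}{dx}(1 + x + x^2 + \cdots + x^n + \cdots)$$
$$= \frac{d}{dx}(1) + \frac{d}{dx}(x) + \frac{d}{dx}(x^2) + \cdots + \frac{d}{dx}(x^n) + \cdots$$
$$= 1 + 2x + 3x^2 + \cdots + nx^{n-1} + \cdots.$$
Differentiate again:
$$(1 - x)^{-3} = \frac{1}{2}\frac{d}{dx}(1 - x)^{-2} = \frac{1}{2}\frac{d}{dx}(1 + 2x + 3x^2 + \cdots + nx^{n-1} + \cdots)$$
$$= \frac{1}{2}[2 + 6x + 12x^2 + \cdots + n(n - 1)x^{n-2} + \cdots].$$
Answer: For $|x| < 1$,
$$(1 - x)^{-2} = \sum_{n=1}^{\infty} nx^{n-1}, \qquad (1 - x)^{-3} = \sum_{n=2}^{\infty} \frac{n(n - 1)}{2}x^{n-2}.$$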
EXAMPLE 4.2
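Find the power series at $x = 0$ for $\ln(1 + x)$.
Solution: Notice that $\ln(1 + x)$ is an antiderivative of $(1 + x)^{-1} =
1 - x + x^2 - x^3 + \cdots$. Therefore
$$\ln(1 + x) = \int_0^x \frac{dt}{1 + t} = \int_0^x [1 - t + t^2 - \cdots + (-1)^nt^n + \cdots]\,dt$$
$$= \int_0^x dt - \int_0^x t\,dt + \int_0^x t^2\,dt - \cdots + (-1)^n\int_0^x t^n\,dt + \cdots.$$
Answer: For $|x| < 1$,
$$\ln(1 + x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \cdots + (-1)^n\frac{x^{n+1}}{n + 1} + \cdots.$$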
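EXAMPLE 4.3
Sum the series $x + \dfrac{x^3}{3} + \dfrac{x^5}{5} + \cdots + \dfrac{x^{2n-1}}{2n - 1} + \cdots$.
Solution: By the ratio test, the series converges for $|x| < 1$ to some
function $f(x)$. Write
$$f(x) = x + \frac{x^3}{3} + \frac{x^5}{5} + \cdots + \frac{x^{2n-1}}{2n - 1} + \cdots.$$
Each term is the integral of a power of $x$. This suggests that $f(x)$ is the integral
of some simple function. Differentiate term-by-term:
$$f'(x) = 1 + x^2 + x^4 + \cdots + x^{2n-2} + \cdots = \frac{1}{1 - x^2}.$$
Therefore, $f(x)$ is an antiderivative of $1/(1 - x^2)$. Since $f(0) = 0$, it follows
that
$$f(x) = \int_0^x \frac{dt}{1 - t^2} = \frac{1}{2}\ln\frac{1 + t}{1 - t}\bigg|_0^x = \frac{1}{2}\ln\frac{1 + x}{1 - x}.$$
Answer: $\dfrac{1}{2}\ln\dfrac{1 + x}{1 - x}$.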
EXAMPLE 4.4
Sum the series $x + 4x^2 + 9x^3 + \cdots + n^2x^n + \cdots$.
Solution: Write
$$f(x) = x + 4x^2 + 9x^3 + \cdots + n^2x^n + \cdots = xg(x),$$
where
$$g(x) = 1 + 2^2x + 3^2x^2 + \cdots + n^2x^{n-1} + \cdots.$$
Now
$$g(x) = \frac{d}{dx}(x + 2x^2 + 3x^3 + 4x^4 + \cdots + nx^n + \cdots) = \frac{d}{dx}[x(1 + 2x + 3x^2 + 4x^3 + \cdots + nx^{n-1} + \cdots)].$$
By Example 4.1,
$$1 + 2x + 3x^2 + \cdots + nx^{n-1} + \cdots = (1 - x)^{-2}.$$
Hence
$$g(x) = \frac{d}{dx}\left[\frac{x}{(1 - x)^2}\right] = \frac{1 + x}{(1 - x)^3},$$
so
$$f(x) = xg(x) = \frac{x + x^2}{(1 - x)^3}.$$
Answer: $\dfrac{x + x^2}{(1 - x)^3}$.
Check: Evaluate the series at $x = 0.1$:
$$f(0.1) = \frac{1}{10} + \frac{4}{10^2} + \frac{9}{10^3} + \frac{16}{10^4} + \frac{25}{10^5} + \cdots.$$
According to the answer, the sum is
$$\frac{(0.1) + (0.1)^2}{(1 - 0.1)^3} = \frac{0.11}{(0.9)^3} \approx 0.15089\,16324.$$
It is easy to make a convincing numeric check. Start with the first term and
add successive terms: 0.1, 0.14, 0.149, 0.1506, 0.15085, 0.15088 6, 0.15089 09,
0.15089 154, 0.15089 1621, 0.15089 16310.
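The numeric check at $x = 0.1$ is easy to automate. The sketch below accumulates the partial sums of $\sum n^2x^n$ in floating point and compares the last one with $(x + x^2)/(1 - x)^3$; the function names `partial_sums` and `closed_form` are introduced here for illustration.

```python
def closed_form(x):
    """The proposed sum (x + x^2) / (1 - x)^3."""
    return (x + x * x) / (1 - x) ** 3

def partial_sums(x, n_terms):
    """Running sums of n^2 * x^n for n = 1 .. n_terms."""
    sums, total = [], 0.0
    for n in range(1, n_terms + 1):
        total += n * n * x ** n
        sums.append(total)
    return sums

sums = partial_sums(0.1, 40)
print(sums[:5])
print(sums[-1], closed_form(0.1))
```

Because the terms shrink roughly by a factor of ten at each step for this $x$, each additional term fixes about one more decimal place, as the hand computation shows.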
Partial Fractions
The power series for $1/(x - 5)$ can be obtained from the geometric
series. Just write
$$\frac{1}{x - 5} = -\frac{1}{5 - x} = -\frac{1}{5}\cdot\frac{1}{1 - \dfrac{x}{5}}$$
and expand in a geometric series:
$$\frac{1}{x - 5} = -\frac{1}{5}\sum_{n=0}^{\infty}\left(\frac{x}{5}\right)^n = -\sum_{n=0}^{\infty}\frac{x^n}{5^{n+1}},$$
which converges if $|x/5| < 1$, that is, $|x| < 5$.
Combined with this trick, the method of partial fractions is useful in
finding power series for rational functions (quotients of polynomials).
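EXAMPLE 4.5
Find a power series at $x = 0$ for $f(x) = \dfrac{1}{(x - 2)(x - 5)}$.
Solution: By partial fractions
$$\frac{1}{(x - 2)(x - 5)} = \frac{1}{3}\left(\frac{1}{x - 5} - \frac{1}{x - 2}\right).$$
Expand each fraction by the preceding trick:
$$\frac{1}{x - 2} = -\frac{1}{2}\sum_{n=0}^{\infty}\left(\frac{x}{2}\right)^n = -\sum_{n=0}^{\infty}\frac{x^n}{2^{n+1}} \quad \text{for } |x| < 2,$$
$$\frac{1}{x - 5} = -\sum_{n=0}^{\infty}\frac{x^n}{5^{n+1}} \quad \text{for } |x| < 5.$$
Therefore if $|x| < 2$, both series converge and
$$\frac{1}{3}\left(\frac{1}{x - 5} - \frac{1}{x - 2}\right) = \frac{1}{3}\sum_{n=0}^{\infty}\left(\frac{1}{2^{n+1}} - \frac{1}{5^{n+1}}\right)x^n.$$
Answer: For $|x| < 2$,
$$\frac{1}{(x - 2)(x - 5)} = \frac{1}{3}\sum_{n=0}^{\infty}\left(\frac{1}{2^{n+1}} - \frac{1}{5^{n+1}}\right)x^n.$$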
EXAMPLE 4.6
Find the power series at $x = 0$ for $\dfrac{-1}{x^2 + x - 1}$.
Solution: The denominator can be factored:
$$x^2 + x - 1 = (x - a)(x - b),$$
where $a$ and $b$ are the roots of the equation
$$x^2 + x - 1 = 0.$$
By the quadratic formula,
$$a = \frac{-1 + \sqrt{5}}{2}, \qquad b = \frac{-1 - \sqrt{5}}{2}.$$
Notice that $ab = -1$ because the product of the roots of $x^2 + px + q = 0$
is $q$. (Why?)
By partial fractions,
$$\frac{-1}{x^2 + x - 1} = \frac{-1}{(x - a)(x - b)} = \frac{1}{a - b}\left(\frac{1}{x - b} - \frac{1}{x - a}\right).$$
Suppose $|x| < \frac{1}{2}(-1 + \sqrt{5})$, the smaller of the numbers $|a|$ and $|b|$. Now
expand:
$$\frac{-1}{x^2 + x - 1} = \frac{1}{a - b}\left(\sum_{n=0}^{\infty}\frac{x^n}{a^{n+1}} - \sum_{n=0}^{\infty}\frac{x^n}{b^{n+1}}\right) = \sum_{n=0}^{\infty}\frac{1}{a - b}\left(\frac{1}{a^{n+1}} - \frac{1}{b^{n+1}}\right)x^n.$$
Note that
$$a - b = \frac{-1 + \sqrt{5}}{2} - \frac{-1 - \sqrt{5}}{2} = \sqrt{5},$$
and (since $ab = -1$)
$$\frac{1}{a} = -b = \frac{1 + \sqrt{5}}{2}, \qquad \frac{1}{b} = -a = \frac{1 - \sqrt{5}}{2}.$$
Hence the coefficient of $x^n$ is
$$c_n = \frac{1}{\sqrt{5}}\left[\left(\frac{1 + \sqrt{5}}{2}\right)^{n+1} - \left(\frac{1 - \sqrt{5}}{2}\right)^{n+1}\right].$$
Remark: It may not seem so, but the numbers $c_n$ are actually integers!
The first few are 1, 1, 2, 3, 5, 8, 13, $\ldots$; each is the sum of the previous two.
See Ex. 17 below.
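The claim that the coefficients are integers can be tested by computing them two ways: from the recurrence obtained by multiplying the unknown series by $1 - x - x^2$, and from the closed form in $\sqrt{5}$ that the partial-fraction expansion leads to. The function names below are assumptions for this sketch, and the closed form is evaluated in floating point, so it is rounded before comparison.

```python
from math import sqrt

def coefficients_by_recurrence(n_terms):
    """Coefficients c_n of 1/(1 - x - x^2), from (1 - x - x^2) * sum(c_n x^n) = 1."""
    c = []
    for n in range(n_terms):
        if n == 0:
            c.append(1)                    # constant term: c_0 = 1
        elif n == 1:
            c.append(c[0])                 # x term: c_1 - c_0 = 0
        else:
            c.append(c[n - 1] + c[n - 2])  # x^n term: c_n - c_(n-1) - c_(n-2) = 0
    return c

def coefficient_closed_form(n):
    """c_n = (r1^(n+1) - r2^(n+1)) / sqrt(5), with r1 = 1/a and r2 = 1/b."""
    r1 = (1 + sqrt(5)) / 2
    r2 = (1 - sqrt(5)) / 2
    return (r1 ** (n + 1) - r2 ** (n + 1)) / sqrt(5)

exact = coefficients_by_recurrence(20)
approx = [round(coefficient_closed_form(n)) for n in range(20)]
print(exact[:8])
print(approx == exact)
```

Note that $-1/(x^2 + x - 1) = 1/(1 - x - x^2)$, which is why the recurrence is written for the latter form.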
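EXAMPLE 4.7
Find a power series for $\dfrac{1}{(1 - x)(1 + x^2)}$.
Solution: By partial fractions,
$$\frac{1}{(1 - x)(1 + x^2)} = \frac{1}{2}\left[\frac{1}{1 - x} + \frac{1 + x}{1 + x^2}\right]$$
$$= \frac{1}{2}[(1 + x + x^2 + x^3 + \cdots) + (1 - x^2 + x^4 - x^6 + \cdots) + (x - x^3 + x^5 - x^7 + \cdots)]$$
$$= 1 + x + x^4 + x^5 + x^8 + x^9 + \cdots.$$
Answer: For $|x| < 1$,
$$\frac{1}{(1 - x)(1 + x^2)} = 1 + x + x^4 + x^5 + x^8 + x^9 + \cdots.$$
Remark: This example also can be done by multiplying together the
series for $(1 - x)^{-1}$ and $(1 + x^2)^{-1}$, also by multiplying the series for
$(1 - x^4)^{-1}$ by $1 + x$.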
EXERCISES
Obtain the power series at x = 0:
x25x + 6
2.
(2x + 1) (3a; + 4)
1
3.
1
4.
1
( x - l ) ( x - 2 ) ( x - 3) ( x - 3 ) ( x 2+ l )
1 + x2
Find the sum of the series:
7. 4 + 5x + 6x2 + + (n + 4
8. 1+ 4x + 9x2 + + (n + 1
+ (n + 4 )xn +
+ (n + 1)2xn +
[Hint: Differentiate.]
10 ____ _ H ) V + ,.,
1-2 2*3 3*4 (ft 1)n
11. 2 + 3a; + 4x2+ + (ft + 2)xn +
[Hint: Multiply by x and integrate.]
12. Evaluate l + ? + A + A + ... + J L +
[Hint: Consider ^ ftzn1.]
n=1
Verify by expressing both sides in power series:
13. y- (sina;) =#cosa; 14. ~ (e- *2) = 2xe~x' 2
dx O/Xf
15. I x2 cos x dx = 2x cos x + (x2 2) sin x + C
s7
, / 16. I arc tan x dx = x arc tan x ^ln(1+ a:2) + C,
17. Consider Example 4.6 from a different viewpoint. Suppose $a_0, a_1, a_2, \ldots$ is a
sequence of integers satisfying $a_0 = a_1 = 1$ and $a_{n+2} = a_n + a_{n+1}$ for $n \ge 0$. Set
$f(x) = a_0 + a_1x + a_2x^2 + \cdots$. Show that $xf(x) + x^2f(x) = f(x) - 1$, and
conclude that $f(x) = (1 - x - x^2)^{-1}$. Now obtain a formula for $a_n$. (This is the
method of generating functions.)
18. Which power series is more efficient for computing ln(), the one for ln(l + x)y
or the one for ln[(l + x)/ (I a;)]? See Examples 4.2 and 4.3. Compute ln(f)
to 5 places.
19. Expand $\arctan x$ in a Taylor series at $x = 0$.
[Hint: Integrate its derivative.] Conclude that
$$\arctan x = x - \frac{x^3}{3} + \frac{x^5}{5} - \frac{x^7}{7} + \cdots \quad \text{for } |x| < 1.$$
It is known that the series in Ex. 19 is valid for $x = 1$, hence that
$$1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \cdots = \frac{\pi}{4}.$$
This is an interesting but poor formula for computing $\pi$, since its terms decrease very
slowly. Exercises 20-23 develop an efficient way to compute $\pi$.
20. Prove the formula $\arctan x + \arctan y = \arctan\left(\dfrac{x + y}{1 - xy}\right)$ for $|x| < 1$ and
$|y| < 1$. Conclude that $\dfrac{\pi}{4} = \arctan\dfrac{1}{2} + \arctan\dfrac{1}{3}$.
21. (cont.) Use this expression for $\pi/4$ and the power series in Ex. 19 to compute $\pi$ to 4
places.
22. (cont.) Show that the expression in Ex. 20 for $\pi/4$ can be modified to
$\dfrac{\pi}{4} = 2\arctan\dfrac{1}{3} + \arctan\dfrac{1}{7}$. Why is this even better for computing $\pi$?
23. (cont.) Finally modify the expression in Ex. 22 to
$\dfrac{\pi}{4} = 2\arctan\dfrac{1}{5} + \arctan\dfrac{1}{7} + 2\arctan\dfrac{1}{8}$. Now compute $\pi$ to 7 places.
5. BINOMIAL SERIES
The Binomial Theorem asserts that for each positive integer $p$,
$$(1 + x)^p = 1 + px + \frac{p(p - 1)}{2!}x^2 + \frac{p(p - 1)(p - 2)}{3!}x^3 + \cdots + \frac{p(p - 1)(p - 2)\cdots(p - n + 1)}{n!}x^n + \cdots + \frac{p!}{p!}x^p.$$
Standard notation for the coefficients in this identity is
$$\binom{p}{0} = 1, \qquad \binom{p}{n} = \frac{p(p - 1)(p - 2)\cdots(p - n + 1)}{n!}, \quad 1 \le n \le p.$$
With this notation the expansion of $(1 + x)^p$ can be abbreviated:
$$(1 + x)^p = \sum_{n=0}^{p}\binom{p}{n}x^n.$$
A generalization of the Binomial Theorem is the binomial series for
$(1 + x)^p$, where $p$ is not necessarily a positive integer.
Binomial Series Suppose $p$ is any number. Then
$$(1 + x)^p = \sum_{n=0}^{\infty}\binom{p}{n}x^n, \qquad -1 < x < 1,$$
where the coefficients in this series are
$$\binom{p}{0} = 1, \qquad \binom{p}{n} = \frac{p(p - 1)(p - 2)\cdots(p - n + 1)}{n!}, \quad n \ge 1.$$
Remark: In case $p$ happens to be a positive integer and $n > p$, then
the coefficient $\binom{p}{n}$
equals 0 because it has a factor $(p - p)$. In this case, the series breaks off
after the term in $x^p$. The resulting formula,
$$(1 + x)^p = \sum_{n=0}^{p}\binom{p}{n}x^n,$$
is the old Binomial Theorem again. But if $p$ is not a positive integer or zero,
then each coefficient is non-zero, so the series has infinitely many terms.
The binomial series is just the Taylor series for $y(x) = (1 + x)^p$ at
$x = 0$. Notice that
$$y'(x) = p(1 + x)^{p-1},$$
$$y''(x) = p(p - 1)(1 + x)^{p-2},$$
$$y^{(n)}(x) = p(p - 1)(p - 2)\cdots(p - n + 1)(1 + x)^{p-n}.$$
Therefore the coefficient of $x^n$ in the Taylor series is
$$\frac{y^{(n)}(0)}{n!} = \frac{p(p - 1)(p - 2)\cdots(p - n + 1)}{n!}.$$
The binomial series converges for $|x| < 1$. When $p$ is a non-negative integer this is
obvious because the series terminates. Otherwise the ratio test
applies:
$$\left|\frac{a_n}{a_{n+1}}\right| = \left|\binom{p}{n}\Big/\binom{p}{n + 1}\right| = \left|\frac{n + 1}{p - n}\right| \longrightarrow 1,$$
so the radius of convergence is $R = 1$.
This, however, does not prove that the sum of the series is $(1 + x)^p$.
That requires a delicate piece of analysis beyond the scope of this course.
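EXAMPLE 5.1
Find the power series for $\dfrac{1}{(1 + x)^2}$.
Solution: Use the binomial series with $p = -2$. The coefficient of
$x^n$ is
$$\binom{-2}{n} = \frac{(-2)(-3)(-4)\cdots(-2 - n + 1)}{n!} = (-1)^n\frac{(n + 1)!}{n!} = (-1)^n(n + 1).$$
Therefore
$$\frac{1}{(1 + x)^2} = \sum_{n=0}^{\infty}\binom{-2}{n}x^n = \sum_{n=0}^{\infty}(-1)^n(n + 1)x^n = 1 - 2x + 3x^2 - 4x^3 + \cdots.$$
Answer: For $|x| < 1$,
$$\frac{1}{(1 + x)^2} = \sum_{n=0}^{\infty}(-1)^n(n + 1)x^n.$$
Check:
$$\frac{1}{(1 + x)^2} = -\frac{d}{dx}\left(\frac{1}{1 + x}\right) = -\frac{d}{dx}(1 - x + x^2 - x^3 + \cdots) = -(-1 + 2x - 3x^2 + \cdots).$$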
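EXAMPLE 5.2
Express $\dfrac{1}{\sqrt{1 + x}}$ as a power series.
Solution: Use the binomial series with $p = -\frac{1}{2}$. The coefficient of
$x^n$ is
$$\binom{-\frac{1}{2}}{n} = \frac{\left(-\frac{1}{2}\right)\left(-\frac{3}{2}\right)\left(-\frac{5}{2}\right)\cdots\left(-\frac{2n - 1}{2}\right)}{n!} = (-1)^n\frac{1\cdot 3\cdot 5\cdots(2n - 1)}{2^n\cdot n!}.$$
But
$$1\cdot 3\cdot 5\cdots(2n - 1) = \frac{1\cdot 2\cdot 3\cdot 4\cdots(2n)}{2\cdot 4\cdot 6\cdot 8\cdots(2n)} = \frac{(2n)!}{2^n\cdot n!}.$$
Hence the coefficient of $x^n$ is
$$(-1)^n\frac{1}{2^n\cdot n!}\cdot\frac{(2n)!}{2^n\cdot n!} = (-1)^n\frac{(2n)!}{(2^n\cdot n!)^2}.$$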
The following example will be used in Section 7.
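EXAMPLE 5.3
Express $\sqrt{1 - x}$ as a power series.
Solution: Use the binomial series with $p = \frac{1}{2}$ and replace $x$ by $-x$.
The constant term is
$$\binom{\frac{1}{2}}{0} = 1.$$
The term in $x^n$ is
$$\binom{\frac{1}{2}}{n}(-x)^n = \frac{\left(\frac{1}{2}\right)\left(-\frac{1}{2}\right)\left(-\frac{3}{2}\right)\cdots\left(-\frac{2n - 3}{2}\right)}{n!}(-x)^n = (-1)^{n-1}\frac{1\cdot 3\cdot 5\cdots(2n - 3)}{2^n\cdot n!}(-1)^nx^n.$$
But
$$1\cdot 3\cdot 5\cdots(2n - 3) = \frac{1\cdot 2\cdot 3\cdot 4\cdots(2n - 2)}{2\cdot 4\cdot 6\cdot 8\cdots(2n - 2)} = \frac{(2n - 2)!}{2^{n-1}(n - 1)!} = \frac{(2n)!}{(2^n\cdot n!)(2n - 1)}.$$
Therefore the term in $x^n$ is
$$-\frac{(2n)!}{(2^n\cdot n!)(2n - 1)}\cdot\frac{1}{2^n\cdot n!}x^n = -\frac{(2n)!}{(2^n\cdot n!)^2(2n - 1)}x^n.$$
Answer: For $|x| < 1$,
$$\sqrt{1 - x} = 1 - \sum_{n=1}^{\infty}\frac{(2n)!}{(2^n\cdot n!)^2(2n - 1)}x^n.$$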
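EXAMPLE 5.4
Estimate $\sqrt[3]{1001}$ to 7 places.
Solution: Write
$$\sqrt[3]{1001} = [1000(1 + 10^{-3})]^{1/3} = 10(1 + 10^{-3})^{1/3}.$$
Use the binomial series with $x = 10^{-3}$ and $p = \frac{1}{3}$:
$$10(1 + 10^{-3})^{1/3} = 10\left[1 + \binom{\frac{1}{3}}{1}(10^{-3}) + \binom{\frac{1}{3}}{2}(10^{-3})^2 + \cdots\right] = 10\left[1 + \frac{1}{3}\cdot 10^{-3} - \frac{1}{9}\cdot 10^{-6} + \cdots\right].$$
The first three terms yield the estimate
$$\sqrt[3]{1001} \approx 10.0033322222\ldots$$
The error in this estimate is precisely the remainder
$$10\left[\binom{\frac{1}{3}}{3}(10^{-3})^3 + \binom{\frac{1}{3}}{4}(10^{-3})^4 + \cdots\right].$$
Now each binomial coefficient above is less than 1 in absolute value:
$$\left|\binom{\frac{1}{3}}{n}\right| = \frac{1}{3}\cdot\frac{2}{6}\cdot\frac{5}{9}\cdots\frac{3n - 4}{3n} < 1.$$
Therefore, the error (in absolute value) is less than
$$10[(10^{-3})^3 + (10^{-3})^4 + \cdots] = 10^{-8}[1 + 10^{-3} + 10^{-6} + 10^{-9} + \cdots] = 10^{-8}\cdot\frac{1}{1 - 10^{-3}} \approx 10^{-8}.$$
Answer: To 7 places, $\sqrt[3]{1001} \approx 10.0033322$.
Remark: The estimate is actually accurate to at least 8 places. When we
estimated each binomial coefficient by 1, we were too generous since
$$\left|\binom{\frac{1}{3}}{n}\right| \le \frac{1\cdot 2\cdot 5}{3\cdot 6\cdot 9} = \frac{10}{162}, \qquad n \ge 3.$$
(Why?) Therefore, the error estimate can be reduced by a factor of $\frac{10}{162}$,
which guarantees at least 1 more place accuracy.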
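The three-term estimate of Example 5.4 can be recomputed exactly. The sketch below builds the generalized binomial coefficient from its product formula using rational arithmetic; the function name `binomial` is a choice made here, and the comparison value `1001 ** (1 / 3)` is an ordinary floating-point cube root used only as a reference.

```python
from fractions import Fraction

def binomial(p, n):
    """Generalized binomial coefficient p(p-1)...(p-n+1)/n! for rational p."""
    result = Fraction(1)
    for k in range(n):
        result *= (p - k)
        result /= (k + 1)
    return result

p = Fraction(1, 3)
x = Fraction(1, 1000)

# First three terms of 10 * (1 + x)^(1/3).
estimate = 10 * sum(binomial(p, n) * x ** n for n in range(3))

# First omitted term, which dominates the error.
first_omitted = 10 * binomial(p, 3) * x ** 3

print(float(estimate))
print(float(first_omitted))
print(1001 ** (1 / 3))
```

The first omitted term is far smaller than the crude bound $10^{-8}$ obtained by replacing every coefficient with 1, which is the point of the remark.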
Expand in power series at x = 0:
1
1.
3.
(1+ x)3
1
(1 - 4a;2)2
5. (1+ 2a^)1M
2.
1
(1 - x ) 113
4. y / T X
[Hi nt ,
EXERCISES
1
(3 - 4x2)2
Expand in power series at x = 1:
7. vT T x [Hint: Write l + z= 2 +( z - l ) =2 ^1 + .]
1
(3 + x)2'
Compute the power series at x = 0 up to and including the term in x4:
1
9. y / l + x2ex 10.
11. (sin 2z)\/3 + x 12.
cos 2x
lo.
(1+ sin x)2
1
\ / l + x + x2
[Hint: Write x + x2= u.~\
(i+i)
Compute to 4-place accuracy using the Binomial Theorem:
14. 15. ^82
1fi 1
(1.03 )5
17. Expand arc sin x in a Taylor series at x = 0.
[Hint: Integrate its derivative.]
6. ALTERNATING SERIES
In Chapter 1, Section 4, we saw that alternating series whose terms
approach 0 in absolute value are convenient to work with. In particular,
remainder estimates are very simple and quite accurate: if you break off such a
series, then the remainder is less than the absolute value of the first term
omitted.
EXAMPLE 6.1
The power series at $x = 0$ for $\ln(1 + x^2)$ is broken off after $n$
terms. Estimate the remainder.
Solution: The series is obtained by integrating from 0 to $x$ the series
for the derivative of $\ln(1 + t^2)$.
$$\frac{d}{dt}[\ln(1 + t^2)] = \frac{2t}{1 + t^2} = 2t(1 - t^2 + t^4 - t^6 + \cdots) = 2(t - t^3 + t^5 - t^7 + \cdots), \qquad |t| < 1.$$
It follows that
$$\ln(1 + x^2) = \int_0^x \frac{2t\,dt}{1 + t^2} = 2\left(\frac{x^2}{2} - \frac{x^4}{4} + \frac{x^6}{6} - \frac{x^8}{8} + \cdots\right) = x^2 - \frac{x^4}{2} + \frac{x^6}{3} - \frac{x^8}{4} + \cdots, \qquad |x| < 1.$$
This series alternates; since $|x| < 1$, its terms decrease in absolute value
to zero. Therefore, the remainder after $n$ terms is less than the absolute
value of the $(n + 1)$-th term.
Answer:
$$|\text{remainder}| < \frac{x^{2n+2}}{n + 1}.$$
Occasionally we encounter an alternating series whose terms ultimately
decrease in magnitude, but whose first few terms do not. If the successive
terms decrease starting at the $k$-th term, the series will still converge, and the
remainder estimate is still valid beyond the $k$-th term. The front end of
the series, up to the $(k - 1)$-th term, is a finite sum; it causes no trouble. The
important part is the tail end, that is, the series starting with the $k$-th term.
It is this series that we test for convergence.
EXAMPLE 6.2
The power series for $e^{-x}$ at $x = 0$ is broken off after $n$ terms.
Estimate the remainder for positive values of $x$.
Solution: If $x > 0$ the power series
$$e^{-x} = 1 - x + \frac{x^2}{2!} - \frac{x^3}{3!} + \cdots$$
alternates. If $0 < x \le 1$, the terms decrease to 0, and the above remainder
estimate for alternating series applies. If $x > 1$, however, the first few terms
may not decrease. (Take $x = 6$ for example:
$$e^{-6} = 1 - 6 + \frac{6^2}{2!} - \frac{6^3}{3!} + \cdots = 1 - 6 + 18 - 36 + \cdots.)$$
Nevertheless, for any fixed $x$, the terms do decrease ultimately.
To see why, note that the ratio of successive terms is
$$\frac{x^{n+1}}{(n + 1)!}\Big/\frac{x^n}{n!} = \frac{x}{n + 1}.$$
Take a fixed value of $x$. Then
$$\frac{x}{n + 1} < 1$$
as soon as $n + 1 > x$; from then on the terms decrease. Furthermore
$$\frac{x}{n + 1} < \frac{1}{2}$$
as soon as $n + 1 > 2x$. From then on each term is less than one-half the
preceding term. Hence, the terms decrease to 0.
Result: if the series is broken off at the $n$-th term, the remainder estimate
for an alternating series applies, provided $n + 1 > x$.
Answer: For $n > x - 1$,
$$|\text{remainder}| < \frac{x^{n+1}}{(n + 1)!}.$$
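The bound in Example 6.2 can be observed numerically at $x = 6$, where the early terms grow before they shrink. In the sketch below, `partial_sum` adds the series through the term in $x^n$ and `remainder_bound` is the first omitted term; both names are assumptions for this illustration, and the comparison uses ordinary floating point.

```python
from math import exp, factorial

def partial_sum(x, n):
    """Sum of the series for e^(-x) through the term in x^n."""
    return sum((-x) ** k / factorial(k) for k in range(n + 1))

def remainder_bound(x, n):
    """Absolute value of the first omitted term, x^(n+1)/(n+1)!."""
    return x ** (n + 1) / factorial(n + 1)

x = 6.0
for n in (6, 10, 15, 20):
    error = abs(exp(-x) - partial_sum(x, n))
    print(n, error, remainder_bound(x, n))
```

For these values of $n$ the condition $n + 1 > x$ holds, and the printed error stays below the printed bound.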
As another application of this method, recall Example 5.4. There it
was shown that three terms of the binomial series for
$$10(1 + 10^{-3})^{1/3}$$
provide 7-place accuracy in computing $\sqrt[3]{1001}$. But this series alternates
and its terms decrease towards 0. Therefore, one deduces that the remainder
after three terms is less than the fourth term, which is
$$10\binom{\frac{1}{3}}{3}(10^{-3})^3 = \frac{5}{81}\times 10^{-8} < 10^{-9},$$
so 8-place accuracy is assured. This estimate is both easier and more precise
than the one in Example 5.4.
1. The power series $\sum_{n=1}^{\infty}\dfrac{x^n}{n}$ converges inside the interval $-1 < x < 1$. Show that it
converges also at $x = -1$.
2. Give an example of an alternating power series that does not converge.
3. Compute $e^{-1/5}$ to 5-place accuracy.
4. Find values of $x$ for which the formula $\cos x \approx 1 - \dfrac{x^2}{2}$ yields 5-place
accuracy.
5. How many terms of the power series for $\ln x$ at $x = 10$ are needed to compute
$\ln(10.5)$ with 5-place accuracy? (Assume the value of $\ln 10$ is known.)
6. How many terms of the binomial series for $\sqrt{1 + x}$ will yield 5-place accuracy
for $0 \le x \le 0.1$?
7. APPLICATIONS TO DEFINITE INTEGRALS [optional]
Power series are used in approximating definite integrals which cannot
be computed exactly.
EXAMPLE 7.1
Estimate $\displaystyle\int_0^x e^{-t^2}\,dt$ to 6 places for $|x| \le \frac{1}{2}$.
Solution: A numerical integration formula such as Simpson's Rule
can be used, but a power series method is simpler. Expand the integrand
in a power series:
$$e^{-t^2} = 1 - t^2 + \frac{t^4}{2!} - \frac{t^6}{3!} + \cdots.$$
Since this series converges for all $t$, it can be integrated term-by-term:
$$\int_0^x e^{-t^2}\,dt = x - \frac{x^3}{3} + \frac{x^5}{5\cdot 2!} - \frac{x^7}{7\cdot 3!} + \frac{x^9}{9\cdot 4!} - \cdots.$$
Because of the large denominators, this series converges rapidly if $x$ is fairly
small. For $|x| \le \frac{1}{2}$, the sixth term is at most
$$\left(\frac{1}{2}\right)^{11}\frac{1}{11\cdot 5!} < 4\times 10^{-7}.$$
Since the series alternates, it follows that five terms provide 6-place accuracy.
Answer: For all $x$,
$$\int_0^x e^{-t^2}\,dt = x - \frac{x^3}{3} + \frac{x^5}{5\cdot 2!} - \frac{x^7}{7\cdot 3!} + \cdots.$$
For $|x| \le \frac{1}{2}$, five terms yield 6-place accuracy.
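The accuracy claim of Example 7.1 can be tested at the endpoint $x = \frac{1}{2}$. The sketch below sums the first five terms of the integrated series and compares with a reference value built from the error function in Python's standard library, using the standard identity $\int_0^x e^{-t^2}\,dt = \frac{\sqrt{\pi}}{2}\,\mathrm{erf}(x)$. The function name `series_integral` is an assumption made for this illustration.

```python
from math import erf, factorial, pi, sqrt

def series_integral(x, n_terms):
    """First n_terms of x - x^3/3 + x^5/(5*2!) - x^7/(7*3!) + ..."""
    return sum(
        (-1) ** n * x ** (2 * n + 1) / ((2 * n + 1) * factorial(n))
        for n in range(n_terms)
    )

x = 0.5
approx = series_integral(x, 5)
reference = sqrt(pi) / 2 * erf(x)

# Sixth term of the series (n = 5), the bound used in the text.
sixth_term = x ** 11 / (11 * factorial(5))

print(approx, reference)
print(abs(approx - reference), sixth_term)
```

The observed difference is just under the sixth term, as the alternating series estimate predicts.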
R e m a r k : The series converges for all values of x, but for large x it
converges slowly. For example, it would be ridiculous to compute
I" J o
e~l dt
by this method, since more than 100 terms at the beginning are greater than 1.
EXAMPLE 7.2
Express the integral $\displaystyle\int_0^{\pi/2}\sqrt{1 - k^2\sin^2 t}\,dt$, where $k^2 < 1$, as a
power series in $k$.
Solution: By Example 5.3,
$$\sqrt{1 - x} = (1 - x)^{1/2} = 1 - \sum_{n=1}^{\infty} a_nx^n, \qquad |x| < 1,$$
where
$$a_n = \frac{(2n)!}{(2^n\cdot n!)^2(2n - 1)}.$$
Substitute $k^2\sin^2 t$ for $x$. This is permissible because $k^2\sin^2 t \le k^2 < 1$.
$$(1 - k^2\sin^2 t)^{1/2} = 1 - \sum_{n=1}^{\infty} a_nk^{2n}\sin^{2n} t.$$
Now integrate term-by-term:
$$\int_0^{\pi/2}(1 - k^2\sin^2 t)^{1/2}\,dt = \frac{\pi}{2} - \sum_{n=1}^{\infty} a_nk^{2n}\int_0^{\pi/2}\sin^{2n} t\,dt.$$
The integrals on the right are evaluated by a reduction formula (see p. 45),
and their values are listed inside the front cover:
$$\int_0^{\pi/2}(\sin x)^{2n}\,dx = \frac{(2n)!}{(2^n\cdot n!)^2}\cdot\frac{\pi}{2}.$$
Therefore
$$a_n\int_0^{\pi/2}\sin^{2n} t\,dt = \left[\frac{(2n)!}{(2^n\cdot n!)^2(2n - 1)}\right]\left[\frac{(2n)!}{(2^n\cdot n!)^2}\cdot\frac{\pi}{2}\right] = \left[\frac{(2n)!}{(2^n\cdot n!)^2}\right]^2\frac{1}{2n - 1}\cdot\frac{\pi}{2}.$$
Substitute this expression to obtain the answer.
Answer: For $k^2 < 1$,
$$\int_0^{\pi/2}\sqrt{1 - k^2\sin^2 t}\,dt = \frac{\pi}{2}\left\{1 - \sum_{n=1}^{\infty}\left[\frac{(2n)!}{(2^n\cdot n!)^2}\right]^2\frac{k^{2n}}{2n - 1}\right\}.$$
Remark: This integral arises in computing the arc length of an ellipse.
Suppose an ellipse is given in the parametric form
$$x = a\cos\theta, \qquad y = b\sin\theta,$$
where $b > a$. If $s$ denotes arc length,
$$\left(\frac{ds}{d\theta}\right)^2 = \left(\frac{dx}{d\theta}\right)^2 + \left(\frac{dy}{d\theta}\right)^2 = a^2\sin^2\theta + b^2\cos^2\theta$$
$$= b^2(\sin^2\theta + \cos^2\theta) - (b^2 - a^2)\sin^2\theta = b^2(1 - k^2\sin^2\theta),$$
where $k^2 = (b^2 - a^2)/b^2$. The length of the ellipse is
$$4\int_0^{\pi/2}\frac{ds}{d\theta}\,d\theta = 4b\int_0^{\pi/2}\sqrt{1 - k^2\sin^2\theta}\,d\theta.$$
7. Applications to Definite Integrals 107
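EXAMPLE 7.3
Estimate $\dfrac{2}{\pi}\displaystyle\int_0^{\pi/2} \sqrt{1 - \tfrac15 \sin^2 x}\,dx$ to 4-place accuracy.
Solution: By the last example with $k^2 = \tfrac15$,
$$\frac{2}{\pi}\int_0^{\pi/2} \sqrt{1 - \tfrac15 \sin^2 x}\,dx = 1 - \sum_{n=1}^{\infty} \frac{b_n}{2n - 1}\left(\frac15\right)^n,$$
where
$$b_n = \left[\frac{(2n)!}{2^{2n}(n!)^2}\right]^2 = \left[\frac{1\cdot 3\cdot 5\cdots(2n - 1)}{2\cdot 4\cdot 6\cdots(2n)}\right]^2.$$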
Break off the series after the p-th term. The remainder (error) is
$$e_p = -\sum_{n=p+1}^{\infty} \frac{b_n}{2n - 1}\left(\frac15\right)^n.$$
The problem: choose p large enough that $|e_p| < 5 \times 10^{-5}$.
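The error is a complicated expression. It can be enormously simplified
in two steps, each of which causes a certain amount of overestimation.
First, each coefficient $b_n$ is less than 1:
$$b_n = \left[\frac{1\cdot 3\cdot 5\cdots(2n - 1)}{2\cdot 4\cdot 6\cdots(2n)}\right]^2 = \left[\frac12\cdot\frac34\cdot\frac56\cdots\frac{2n - 1}{2n}\right]^2 < 1;$$
hence
$$|e_p| = \sum_{n=p+1}^{\infty} \frac{b_n}{2n - 1}\left(\frac15\right)^n < \sum_{n=p+1}^{\infty} \frac{1}{2n - 1}\left(\frac15\right)^n.$$
Second, for $n \ge p + 1$,
$$\frac{1}{2n - 1} \le \frac{1}{2(p + 1) - 1} = \frac{1}{2p + 1};$$
hence replace $1/(2n - 1)$ by $1/(2p + 1)$:
$$|e_p| < \sum_{n=p+1}^{\infty} \frac{1}{2n - 1}\left(\frac15\right)^n \le \frac{1}{2p + 1}\sum_{n=p+1}^{\infty} \left(\frac15\right)^n.$$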
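On the right is a geometric series; consequently
$$|e_p| < \frac{1}{2p + 1}\left(\frac15\right)^{p+1}\left[1 + \frac15 + \left(\frac15\right)^2 + \cdots\right] = \frac{1}{2p + 1}\left(\frac15\right)^{p+1}\frac{1}{1 - \frac15} = \frac{1}{4(2p + 1)5^p}.$$
To obtain 4-place accuracy, choose p so that
$$\frac{1}{4(2p + 1)5^p} < 5 \times 10^{-5}.$$
To find a suitable value of p, take reciprocals:
$$4(2p + 1)5^p > 20000, \qquad (2p + 1)5^p > 5000.$$
Trial and error shows
$$(2\cdot 3 + 1)5^3 = 875, \qquad (2\cdot 4 + 1)5^4 = 5625.$$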
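Accordingly, choose p = 4 and estimate the integral by
$$1 - b_1\left(\frac15\right) - \frac{b_2}{3}\left(\frac15\right)^2 - \frac{b_3}{5}\left(\frac15\right)^3 - \frac{b_4}{7}\left(\frac15\right)^4
= 1 - \frac{1}{20} - \frac{3}{1600} - \frac{1}{6400} - \frac{7}{409600} \approx 0.94795.$$
Answer: 0.9480.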
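The trial-and-error step and the final arithmetic of Example 7.3 can be reproduced with exact rational arithmetic. This is only a sketch; the names `b` and `error_bound` are chosen here for illustration and mirror $b_n$ and the bound $1/\bigl(4(2p+1)5^p\bigr)$ derived above.

```python
from fractions import Fraction

def b(n):
    """b_n = [1*3*5*...*(2n-1) / (2*4*6*...*(2n))]^2 as an exact fraction."""
    ratio = Fraction(1)
    for j in range(1, n + 1):
        ratio *= Fraction(2 * j - 1, 2 * j)
    return ratio ** 2

def error_bound(p):
    """The overestimate 1 / (4 (2p+1) 5^p) of |e_p|."""
    return Fraction(1, 4 * (2 * p + 1) * 5 ** p)

# Smallest p whose bound falls below 5 * 10^(-5).
p = 1
while error_bound(p) >= Fraction(5, 100000):
    p += 1

# Partial sum through the p-th term.
estimate = 1 - sum(b(n) / (2 * n - 1) * Fraction(1, 5) ** n
                   for n in range(1, p + 1))
print(p, estimate, float(estimate))
```

Working with `Fraction` avoids rounding in the individual terms, so the fractions 1/20, 3/1600, 1/6400, and 7/409600 can be compared directly with the hand computation.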
EXERCISES
Express as a power series in x:
* sin t . * 1 cos t
4. / sin (t2) dt.
Compute to 5-place accuracy:
/
0.2
e*2dx
r 1/4 _____________
7. / y/ 1 + x* dx
Cz. 01
8. I dx [Expand at a; = 3.]
/ 3.01 ear
/s.00 1+ 3
J 3.00 1+ 3
9. Compute to 4-place accuracy the arc length of an ellipse with semi-axes 40 and 41.
10. Estimate the value of x for which
[Hint: Approximate the integral by the first significant term of its power series;
use Ex. 3.]
11. (cont.) Refine your estimate of x to 4-place accuracy by taking the first two
significant terms of the power series. Use Newton's Method to solve approximately
the resulting equation.
8. Uniform Convergence [optional]
Any basic information we need about the convergence of an infinite
series $\sum f_n(x)$ of functions is carried in the corresponding sequence $\{s_n(x)\}$ of
partial sums. In this section we shall study some important properties of
sequences of functions, then apply the results to power series.
Definition Let $\{u_n(x)\}$ be a sequence of functions, all with the same
domain D. The sequence is said to converge uniformly on D to $u(x)$ if
given any $\varepsilon > 0$, there exists an N such that
$$|u_n(x) - u(x)| < \varepsilon$$
for all $n \ge N$ and all x in D.
The last four words are the key to this concept. We can control the degree
of approximation of $u_n(x)$ to $u(x)$ independent of x.
The next three results show the usefulness of uniform convergence.
Theorem 8.1 Let $\{u_n(x)\}$ be a sequence of continuous functions on D
and let $u_n(x) \to u(x)$ uniformly on D. Then $u(x)$ is continuous on D.
Proof: Let $\varepsilon > 0$. Then there is an N such that
$$|u_N(x) - u(x)| < \tfrac13\varepsilon$$
for all x in D. Take any point c of D. Since $u_N$ is continuous at c, there is
$\delta > 0$ such that
$$|u_N(x) - u_N(c)| < \tfrac13\varepsilon$$
for all x in D such that $|x - c| < \delta$. If x is in D and $|x - c| < \delta$, then
$$|u(x) - u(c)| = |u(x) - u_N(x) + u_N(x) - u_N(c) + u_N(c) - u(c)|$$
$$\le |u(x) - u_N(x)| + |u_N(x) - u_N(c)| + |u_N(c) - u(c)| < \tfrac13\varepsilon + \tfrac13\varepsilon + \tfrac13\varepsilon = \varepsilon.$$
This proves the continuity of u at c.
Theorem 8.2 Let $\{u_n(x)\}$ be a sequence of continuous functions on a
closed interval $[a, b]$, and suppose $u_n(x) \to u(x)$ uniformly on $[a, b]$. Then
$$\int_a^b u(x)\,dx = \lim_{n\to\infty} \int_a^b u_n(x)\,dx.$$
Proof: The function u is continuous by the previous theorem, hence
integrable. Let $\varepsilon > 0$. Then there is an N such that
$|u_n(x) - u(x)| < \varepsilon/(b - a)$ for all $n \ge N$ and for all x in $[a, b]$. If $n \ge N$, then
$$\left|\int_a^b u_n(x)\,dx - \int_a^b u(x)\,dx\right| = \left|\int_a^b [u_n(x) - u(x)]\,dx\right| \le \int_a^b |u_n(x) - u(x)|\,dx < \int_a^b \frac{\varepsilon}{b - a}\,dx = \varepsilon.$$
Hence
$$\int_a^b u_n(x)\,dx \to \int_a^b u(x)\,dx.$$
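The definition of uniform convergence can be explored numerically by estimating the largest distance between $u_n$ and its limit over the domain. The sketch below does this on a finite grid for the illustrative choice $u_n(x) = x^n$ with limit 0 (this sequence and the helper names are assumptions made here, and a grid maximum only approximates the true supremum). On $[0, \tfrac12]$ the maximum is $(\tfrac12)^n$, which tends to 0; on an interval reaching almost to 1 it stays near 1 for the values of n tried.

```python
def sup_distance(u_n, u, points):
    """Largest sampled value of |u_n(x) - u(x)|: a grid estimate of the supremum."""
    return max(abs(u_n(x) - u(x)) for x in points)

def grid(a, b, count=1001):
    """Evenly spaced sample points in [a, b], endpoints included."""
    return [a + (b - a) * i / (count - 1) for i in range(count)]

zero = lambda x: 0.0
half = grid(0.0, 0.5)        # uniform convergence expected here
near_one = grid(0.0, 0.999)  # stands in for [0, 1); convergence is not uniform

for n in (5, 10, 20):
    u_n = lambda x, n=n: x ** n
    print(n, sup_distance(u_n, zero, half), sup_distance(u_n, zero, near_one))
```

The first column of distances shrinks geometrically, so a single N works for every x in $[0, \tfrac12]$, which is exactly what the definition requires.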
Th e o re m 8.3 Let {un( x ) } be a sequence of continuous functions on an open interval (a, b) and suppose un( x ) --------u{x) for each x in (a, b). Assume also that each un(x) is differentiable, that the derivatives un'{x) are con tinuous on (a, b), and that un' ( x ) -------->v{x) uniformly on (a, b). Then u{x) is differentiable and u'(x) = v(x). Proof: Fix c in (a, b) and let x be any point of this interval. Then j r Un {t) d t -------->J' v it) dt by the previous theorem. Hence Unix) ~ Unic) --------> Vit) dt. 8. Uniform Convergence 111 But un(x) --------u(x) and un(c) -------- >w(c), so u(x) u{c) = / v(t) dt. By the Fundamental Theorem of Calculus, u is differentiable and ur = v. R e m a r k : The last two results can be interpreted in terms of inter changing operations. Theorem 8.2 says that lim^*, and can be inter changed (under suitable hypotheses) and Theorem 8.3 says that limn-oo and d/ dx can be interchanged. Note the key role played by uniform conver gence in these results. I t is easy to translate the results above into statements about infinite series of functions. The starting point is the elementary fact that everything reasonable holds for finite sumsthat is what the partial sums of a series are. Th e o re m 8.4 Let { f n( x) } be a sequence of continuous functions on a domain D and assume J2n=i f n( x) converges uniformly on D to F ( x ) . Then ( 1) F( x ) is continuous on D . (2) If D = [a, 6], then F( x ) is integrable on [a, 6] and (3) Suppose the functions f n (x ) have continuous derivatives and Y fn (#) converges uniformly to G(x) on an open interval D = (a, b). Then F(x) is differentiable and Infinite Series n =1 F ' ( x ) = G ( x ) = /.'(* ) n =1 for each x in D. Proof: Let sn(x) = 2 * =i /* () Then sn(x) -------->F (x) uniformly on D . But sn(x) is continuous, being a finite sum of continuous functions; hence F(x) is continuous by Theorem 8.1. This proves (1). 112 3. POWER SERIES Now F(x) is continuous, hence integrable on [a, 6]. 
By Theorem 8.2, / F(x) dx = lim / sn(x) dx = lim / / /* ( #) dx J a n-+ oo / o n-*oo J a J k =1 6 = lim V f f k(x) dx = V f fk( x) dx. n-+ oo ' T */ a a fc= l k 1 This proves (2). By Theorem 8.3, the function F(x) is differentiable and n [ ^ / * ( ^ ) ] A- =1 F' (x) = G(x) = lim s'(a:) = lim -7- n-*oo dX n co = lim Y /* (* ) = V /t(z)- n~* A: =1 k = l This proves (3). Re ma r k : Parts (2) and (3) of Theorem 8. 4 are again results about interchanging the order of operations. They may be stated formally as follows: 00 00 (2) l f n(x) dx - u f n(x) dx, n =1 (3) n 1 n =1 The M-test In applying these results, the first step is always proving the uniform convergence of some series. Often we make use of the following criterion for uniform convergence, called the Weierstrass M-test. Th e o re m 8.5 (M -te st) Suppose f n(x) are given on a domain D and there are constants Mn such that (1) Y Mn converges, (2) \ f n(x)\ < Mn for all x in D and all n. Then J2fn(x) converges uniformly on D . Before reading the proof, it is advisable to review the Cauchy Test in Chapter 1, Section 2. 8. Uniform Convergence 113 Proof: Let e> 0. By the Cauchy Test, there is an N such that M n+i + M n+2+ + M m < whenever m > n > N. Therefore 2 /*(*) < 2 I/* (* )!< 2 M * <| /c=n+l A:=n+l k =n+l for all x in D whenever m > n > N. By the Cauchy Test again (if part), for each x in D, the series X) /*(#) converges to a number F(x). Now let -------->oo in the last displayed in equality: ' k =n-f*l for all n > N and all x in D. An equivalent statement is n F{x) - % f k(x) k =l that is, |F() - 8n(x)\ < e for all n > N and all x in D, where sn(x) is the n-th partial sum of J^fk(x). This says that the series converges uniformly to F(x) on D. Power Series Let us derive the main facts about integration and differentiation of power series. First we need a result on the uniform convergence of power series. 
Theorem 8.6 I f a power series Y an(x c) n converges in the interval \x c\ < R, then it converges uniformly in each smaller interval \x c\ < r, where 0 < r < R. Proof: I f \x c\ < r, then |an(x c)n| < \anrn\. Since r < R, the series Y arJ,n converges absolutely. Therefore the M-test applies with Mn = \anrn\. Done. Next we need uniform convergence of the series Y l nan{% c)n_1 of term-by-term derivatives. Theorem 8.7 I f a power series Y an(x c) n converges in the interval \x c\ < R, then the series Y nan{x c)n_1 converges uniformly in each smaller interval \x c\ < r, where 0 < r < R. Proof: By Theorem 8.6, it suffices to prove that the power series X nan(x c)n_1converges in \x c\ < R. Let \x c\ < R. Choose r so that \x c\ < r < R. Set b = \x c\/r, so 0 < b < 1. Then nbn~l -------->0 (by Lhospitals rule for instance); hence ribn~l < r for n sufficiently large. This implies |n(x c)n~l| < rn, |nan(x c)n_1| < \anrn\ for n sufficiently large. But X) anTn converges absolutely since r < R. There fore nan(x c)n~~l converges by the M-test. Now we can state the main results of this section. 114 3. POWER SERIES Theorem 8.8 Let F( x) = X)an(rc c) n, where the power series con verges on the interval \x c\ < R. Then (1) F(x) is continuous on \x c\ < R. (2) F(x) is integrable on \x c\ < R and oo c>+' n =0 for \x c\ < R. (3) F(x) is differentiable on \x c\ < R and Ff (x) = ^ nan(x c)n~l. n = 1 Proof: Except for a small technical detail, everything follows routinely from the previous results. We cannot conclude right off the bat that F(x) is continuous on \x c\ < R because we do not know that Y an(x c)n converges uniformly on all of the interval \x c | < R. We know only that it converges uniformly on each closed interval \x c\ < r, where r < R. But that is enough, for if 0is any point such that |#0c\ < 72, we can choose r so that \x0 c\ < r < R. Hence x0 belongs to the open interval \x c\ < r. 
But Theorem 8.6 shows that the power series converges uniformly on \x c\ < r, hence F(x) is continuous on \x c\ < r. I n particular F(x) is continuous at x0. (Of course we have used the obvious continuity of the summandspoly nomialsan(x c)n.) The rest of the theorem is proved similarly. The last statement in the theorem can be applied to F', then to F';, etc. Thus F can be differentiated repeatedly, and dkF V ' - = / n(n 1)* (n k + 1)anxn~k for |.r c\ < R. dxk n k Re ma r k : We have carefully avoided the endpoints of the interval of 8. Uniform Convergence 115 convergence. If F(x) = 2 anxn has radius of convergence R, then the power series converges for \x\ < R, and we may not know what happens at x = dbR. But suppose the series converges at x = R. Then it can be proved that onxn converges uniformly on 0 < x < R, and hence that F(x) is continuous and can be integrated on 0 < x < R. This is a fairly deep result, and its proof is beyond our scope. I t is needed to prove such statements as . * dx 1 1 1 l n 2 * I + 7r -jC r J f r x , . d x 1 , 1 1 , j - a r c t a n l - / 2. Let fn(x) = EXERCISES 1. Let/n(:r) = xn. Show that the sequence {/n(a0) converges uniformly on [0, but not on [0, 1]. rnx, 0 < x < 1/n 2 nx, 1/n < x < 2/n 0 2/n < x. Prove that /(x) --------0 for all x > 0, but the convergence is not uniform on [0, oo). 3. Prove that xe~nx-------->0 uniformly on [0, oo). [Hint: Find the maximum value of xe~nx.] 4. Determine whether x2e~nx-------->0 uniformly on [0, oo ). 5. Prove that (sin nx)/n2 is continuous on R. 6. Prove that 2 1/ (1+ 3n) is continuous for x > 1. 00 7. Prove that f ( x ) = ^ e~nx sin nx is continuous for x > 0. n =1 8. (cont.) J ustify the formula / i t . n =1 f ( x) dx= y I e~nx sin nx dx. 9. J ustify the formula 00 00 n ^ cos nx d |~V^sin nxl _ V* < dx (_ n3 J n =1 10*. In Theorem 8.3, suppose the hypothesis un(x) -------->u(x) for each x in (a, b) is replaced by {un{c)) converges for some c in (a, 6). 
Prove that {un(x)} converges for each x follows anyhow. 11. If 2 an is an absolutely convergent series, prove that ^ an sin nx converges uniformly. 12. Suppose fn(x) -------->F (x) uniformly on [a, 6]. Does it follow that/n' (x) --------> F' (x) on [a, 6]? 4 . S o lid A n a lyti c G e o m e try 1. COORDINATES AND VECTORS I n this chapter we shall develop tools useful for geometric applications of calculus and for the study of functions of several variables. Although we shall work in the euclidean three-space R3, we note that most of what we do applies as well to the euclidean plane R2. First we introduce a rectangular coordinate system in R3. We select an origin 0 and three mutually perpendicular real axes through the origin (Fig. 1.1a). F i g . 1.1 Once directions are fixed on the x- and y-axes, the direction on the 2-axis is determined by the right-hand rule: curl the fingers of your right hand from the positive ^-direction towards the positive ^/-direction; the thumb will point in the positive ^-direction (Fig. 1.1b). (In drawings of 3-space, it is convenient to think of the #-axis as pointing straight up from the paper). We refer to the plane of the #-axis and y-axis as the x, 2/-cordinate plane or simply the x, y-plane, etc. (Fig. 1.2a). Now take any point x in space. Pass planes through x parallel to each coordinate plane. Their intersections with the coordinate axes determine three 1. Coordinates and Vectors 117 (a) the coordinate planes (b) coordinates of a point Fig 1.2 numbers x, y, z, called the coordinates of x. See Fig. 1.2b. Conversely, each triple (x, y, z) of real numbers determines a unique point x in space. We shall write x = (x, y, z). A point (.x, y, z) is located by marking its projection (x, y, 0) in the Xj 2/-plane and going up or down the corresponding amount z. (From the habit of living in the x, y-plane for so long, we think of the ^-direction as up.) See Fig. 1.3a for some examples. (a) locating points (b) dashes for hidden lines Fi g . 
1.3 118 4. SOLID ANALYTIC GEOMETRY The portion of space where x, y , and z are positive is called the first octant. (No one numbers the other seven octants.) Sometimes part of a figure which is not in the first octant is shown; dotted lines indicate it is behind the co ordinate planes (Fig. 1.3b). The angle at which the #-axis is drawn in the y , 2-plane is up to you. Choose it so that your drawing is as uncluttered as possible. Actually it is perfectly alright to take a projection into other than the y, 2-plane, so that the y- and 2-axes are not drawn perpendicular. Vectors We now introduce the concept of a vector, then vector algebra and, later, vector analysis. Vectors are most useful for handling problems in space because (1) equations in vector form are independent of choice of coordinate axes, hence are well suited to describe physical situations; (2) each vector equation replaces three ordinary equations; and (3) several frequently occurring procedures can be summarized neatly in vector form. Let the origin 0 be fixed once and for all. A vector in space is a directed line segment that begins at 0; it is completely determined by its terminal point. Denote vectors by bold-faced letters x, v, F, r, etc. (In written work use x or x instead of x.) A point (x, y, z) in space is often identified with the vector x from the origin to the point. A vector is determined by two quantities, length (or magnitude) and direction. Many physical quantities are vectors: force, velocity, acceleration, electric field intensity, etc. Remember that the origin 0 is fixed, and that each vector starts at 0. We often draw vectors starting at other points, but in computations they all originate at 0. For example, if a force F is applied at a point x, we may draw Fig. 1.4a because it is suggestive. But the correct figure is Fig. 1.4b. One must specify both the force vector F (magnitude and direction) and its point of application x. Fi g . 
1.4 drawing vectors x (a) picturesque (b) correct With respect to coordinate axes, each vector x has three components (coordinates) x, y, and 2, which we indicate by the notation X = (*, y, z). 1. Coordinates and Vectors 119 See Fig. 1.5. Sometimes it is convenient to index the components, writing x = (#i, x2, 3) instead of x = (x, y) z). The zero vector (origin) will be written 0 = (0, 0, 0). For this vector only, direction is undefined. F i g . 1.5 components of a vector Addition of Vectors The sum u + v of two vectors is defined by the parallelogram law (Fig. 1.6). The points 0, u, v, u + v are vertices of a parallelogram with u + v opposite to 0. F i g . 1.6 parallelogram law of vector addition Vectors are added numerically by adding their components: (uh u2, u3) + (vh v2, Vs) = (Ui + Vi, U2 + V2, u3 + v3). For example, (- 1,3,2) + (1,1,4) = (0,4,6), (0,0,1) + (- 1,0,1) = (-1,0,2). 120 4. SOLID ANALYTIC GEOMETRY Let us prove that the sum of vectors, defined geometrically by the parallelo gram law, can be computed algebraically by adding corresponding components. We pass planes P, Q, R through u, v, and w = u + v parallel to the #3, #i-plane (Fig. 1.7). They meet the #2-axis at u2, v2, and w2. Because vw and Ou are parallel, the directed distance from Q to R equals the directed distance from the xs, #i-plane to P. Hence w2 v2= u2, that is, w2 = u2 + v2. Similarly Wi = ui + V\ and Wz = Uz + vz. Fi g . 1.7 proof of componentwise addition Multiplication by a Scalar Let v be a vector and let a be a number (scalar). We define the product av to be the vector whose length is |a| times the length of v and which points in the same direction as v if a > 0, in the.opposite direction if a < 0. If a = 0, then av = 0. The physical idea behind this definition is simple. I f a particle moving in a 2v Fig. 1.8 scalar multiples v 1. 
Coordinates and Vectors 121 certain direction doubles its speed, its velocity vector is doubled; if a horse pulling a cart in a certain direction triples its effort, the force vector triples. Figure 1.8 illustrates multiples of a vector. Scalar multiples are computed in components by the following rule. a(vh v2, vs) = (avh av2, avz) . This rule is proved by similar triangles (Fig. 1.9). The triangle 0y2v is similar to 0w2w, hence w2 = av2, etc. Fi g . 1.9 proof of a(vi, v2, t>3) = (avi, av2, avz) difference v w of two vectors is defined by V W = V + ( w). The v 0 (a) w (b) (c) Fig. 1.10 difference of vectors v w w 122 4. SOLID ANALYTIC GEOMETRY See Fig. 1.10. (The vector w has the same length as w but points in the opposite direction.) The segment from the tip of w to the tip of v (the dashed line in Fig. 1.10c) has the same length and direction as v w. Hence if two points are represented by vectors v and w, the distance between them is the length of v w. The basic rules of vector algebra follow directly from the coordinate formulas for addition and multiplication by a scalar. Rules of Vector Algebra v + 0 = 0 + v = v v + ( - V ) = ( v) + v = 0 u + v = v + u U + (v + w) = (u + v) + w Ov = 0 l v = v a (bw) = (ab)v (a + J>)v = av + 6v a(v + w) = av + aw EXERCISES Draw axes as in Fig. 1.1a and locate each point accurately: 1. (1,2,3), (1,3,4)* 2. (2,4,1), (2,-4, 1) 3. (-1, 2, 1), (2, 2, - 1) 4. (1, - 3, 3), (3, 2, - 2) 5. (4, 6, -1), (4, -6, 1) 6. (0, 0, -3), (-2, - 5, - 3). Draw the parallelepiped with edges parallel to the axes and locate the vertices. The ends of a diagonal are: 7. (0,0,0), (2,3,1) 8. (4,2,3), (1,1,1). Are the points collinear? 9. (0, 0, 0), (1, 3, 2), (2, 6, 4) 10. (0, 0, 0), (-1, 3, -4), (2, - 5, 8) 11. (1,1,1), (0,1,2), ( - 1,- 3,- 5) 12. (1,-1,-2), (-1,2,3), (3,-4,- 7). Compute: 13. (1, 2, - 3) + (4, 0, 7) 14. (-1, -1, 0) + (3, 5, 2) 15. (4, 0, 7) - (1, 2, - 3) 16. (2, 1, 1) - (3, -1, - 2) 17. (1, 2, 3) - 6(0, 3, - 1) 18. 
4[(1, - 2, - 7) - (1, 1, 1)] 19. 3(1, 4, 2) - 2(2, 1, 1) 20. 4(1, -1, 2) - 3(1, - 1, 2). Prove: 21. u + v = v + u 22. u + (v + w ) = (u + v ) + w 23. (a + 6 )v = a v + 6 v 24. a (v + w ) = a v + a w. 25. Show that \ (v + w ) is the midpoint of the segment from v to w. [Hint: Use the parallelogram law.] 26. (cont.) Use Ex. 25 to show that the segments joining the midpoints of opposite sides of a (skew) quadrilateral bisect each other. [Hint: (u + v ) + (w + z ) = (v + w ) + (u + z ). ] 27. Show that J (u + v + w ) is the intersection of the medians of the triangle with vertices u, v, and w. 2. Length and Dot Product 123 28*. In a tetrahedron, prove that the four lines joining each vertex to the centroid (intersection of the medians) of the opposite face are concurrent. 29*. Space billiardsno gravity. The astronaut cues a ball toward the corner of a rectangular room, with velocity v. The ball misses the corner, but rebounds off of each of the three adjacent walls. Find its returning velocity vector. 2. LENGTH AND DOT PRODUCT The length |v| of a vector v = (vh v2, v3) is its distance from the origin. By regarding this distance as the length of the diagonal of a rectangular solid (Fig. 2.1a), we see that V |2 = VI2 + V22 + Vz2. (a) length of a vector: (b) distance from v to w ______ = |v - w| iv|* =( v v + = l>i2 + Vi1 + v2 Fig. 2.1length and distance Vector length has the following properties: | 0| = 0, |v| > 0 if v 5* 0, |av| = |o|*|v|, iv + w| < JvJ + |w| (triangle inequality). The distance between two points v and w is equal to |v w|. See Fig. 2.1b. If v = (vh v2, Vz) and w = (wi, wz), then v w = (vi wh v2 w2) Vz Wz)- Therefore we have the distance formula: 124 4. SOLID ANALYTIC GEOMETRY Distance Formula The distance between two points v and w is |v w|, where |v w|2 = (^1 W i ) 2 + (v2 w2)2 + (Vz Wz ) 2. Dot Product There is another important vector operation, the inner product or dot product of two vectors. 
Let v and w be vectors, and let 0be the angle between them (Fig. 2.2a). Define Y-W = |V| |w| COS0. Since cos(0) = cos 0, you can measure 0from v to w or from w to v. Note (Fig. 2.2b, c) that |w| cos 0is the (signed) projection of w on v, hence v w is |v| times the projection of w on v. If v = 0 or w = 0, we define v w = 0 even though 0is not defined. (a) (b) (c) Fig. 2.2dot product Impo r t a n t : The dot product of two vectors is a scalar (number), not a vector. The numerical rule for computing dot products is V W = V1W1 + V2W2 + VzWz, an important formula. Let us prove it. See Fig. 2.3. By the Law of Cosines, Hence V#W= \ [jV^ ~ lv ~ wl2] = 2 [l (Vl>V2 V z I (Wl9 W2>W3^2 I Wh V* 2, ^3- |2J = ^ (^l2 + V22 + VS2 ) + ( Wi 2 + W22 + Ws2) (Vi Wi ) 2 (v2 W2) 2 (v3 * ) ] = ViWi + V2W2 + VsWs. 2. Length and Dot Product 125 The main algebraic properties of the inner product follow easily from the formula v*w = ViWi + v2w2 + VsWs'. v w = w*v (av)*w = v (aw) = a ( v w ) (u + v) *w = u*w + v w . Two vectors v and w are perpendicular if 0 = w/ 2, i.e., if cos 0 = 0 . This can be expressed very neatly as follows: The condition for vectors v and w to be perpendicular is v w = 0. (The vector 0 is considered perpendicular to every vector.) For example, (1, 2, 3) and ( 1, 1, 1) are perpendicular because (1, 2, 3 ) . ( 1, - 1 , 1) = - 1 - 2 + 3 = 0. 126 4. SOLID ANALYTIC GEOMETRY There is a connection between lengths and dot products. The dot product of a vector v with itself is v v = Ivl2cos 0 = Ivl2. For any vector v, VV = |v|2 = Vi2 + V22 + Vz2. From the dot product can be found the angle 0between any two non-zero vectors v and w. Indeed, cos 0 = V* w Ivl |w| E XAM P L E 2.1 Find the angle between v = (1, 2, 1) and w = (3, 1, 1). Solution: v w = 3 2 + 1 = 2, | v |2= 1 + 4 + 1 = 6, | w|2 = 9 + 1 + 1 = 11. Hence cost/ = v W n Answer: arc cos \/66 E XAM P L E 2.2 The point ( 1, 1, 2) is joined to the points (1, 1, 1) and (3, 0, 4) by lines L\ and L2. 
What is the angle 0 between these lines? Solution: The vector v = (1, - 1, - 1) - (1,1,2) = (0, 2, 3) is parallel to Li (but starts at 0). Likewise w = ( 3 , 0 , 4 ) - (1, 1,2)= (2, - 1,2) is parallel to L2. Hence v w 0 + 2 6 4 cost/ = |vj |w| \/0 + 4 + 9\/4 +1 +4 V l 3\/9 ' No t e : When we find cos#< 0 for an angle 6 between two lines, then 2. Length and Dot Product 127 0is not the smaller angle between the lines, but its supplement. The basic fact here is that c o s (t 0) = cos 0. Direction Cosines I t is customary to use the notation i = (1,0,0), j = (0,1,0), k = (0,0,1) for the three unit-length vectors along the positive coordinate axes (Fig. 2.4). If v is any vector, then v = (vh v2, vz) = Vi (l , 0, 0) + 02 (0, 1, 0) + f73(0, 0, 1) = Vi\ + v2] + ^3k. Fi g . 2.4 the basic unit vectors Thus v is the sum of three vectors yj, v2\ ) vzVi which lie along the three co ordinate axes. The components vh v2, Vs can be interpreted as dot products: v-i = (yh v2) Vs) (1, 0, 0 ) = vi. Similarly, v2 = v j and vs = v*k. Now suppose u is a unit vector, i.e., a vector of length one (Fig. 2.5a). Let a be the angle from i to u. Define and y similarly. Then u*i = cos a, u j = cos 0, and u k = cos 7. Hence u = cos a i + cos /3 j + cos 7 k = (cos a, cos 13 cos 7 ). Since |u| = 1, cos2a + cos2p + cos27 = 1. Unit vectors are direction indicators. Any non-zero vector v is a positive multiple of a unit vector u in the same direction as v. I n fact v = |v| u, so 128 4. SOLID ANALYTIC GEOMETRY k Fi g . 2.5 direction cosines Each non-zero vector v can be expressed as v = |v| u, u a unit vector, or as v = |v| (cos a, COS 13, cos 7 ). The numbers cos a, cos /3, cos 7 are called the direction cosines of v. They satisfy cos2 a + cos213 + cos2 7 = 1. If u is a unit vector (Fig. 2.5b) in the plane of i and j, then u = cos a i + cos /3j. Since u is a unit vector, cos2 a + cos2 /3 = 1. But, as is seen in the figure, cos j3 = sin a. 
Therefore, the preceding equation simply says cos2 a + sin2 a = 1. Summary A d d i t i o n o f V e c t o r s (Vi, v2, Vs) + (tt>i, W2, Ws ) = (^1+ Wi , v 2 + w 2, Vs + Ws ) . M u l t i p l i c a t i o n b y a S c a l a r : a ( v i , v2) Vs) = ( a v h a v 2, a v s ) . L e n g t h : | V |2 = V V = V12 + V22 + Vs2. 2. Length and Dot Product 129 v* w = | v| | w| cos 6 = ViWi + V2W2+ VsWs. V e c t o r s i, j, k : These are unit vectors in the direction of the positive #-axis, ?/-axis, 2-axis, respectively. I f v = (vh v2, Vs), then v = v{\ + v2\ +v3k> where Vi = v i , v2 = v j , = v k . D i r e c t i o n C o s i n e s : I f u is a unit vector, then u = cos a i + cos @j + cos 7 k, where a, 0, 7 are the angles to u from i, j, k, respectively. Furthermore cos2 a + cos2 P + cos2 7 = 1. Any non-zero vector v can be written asv = | v| u = | v| (cos a, cos 0, cos 7 ). The numbers cos a, cos ft, cos 7 are the direction cosines of v. D o t Pr o d u c t : EXERCISES Compute: 1. (8, 2, 1). (3, 0, 5) 2. (-1, -1, - 1). (1, 2, 3) 3. (1,0, 2).[(1,4, 1)+ (2,0, - 3)] 4. 1(2,-4,7)1 5. |3l - j + k| 6. |(1, 1, 0) (3, 5, 2)| 7. | W 3 (-1 , 1 , 1 )| 8. [3 j - (1 , 1 , 2 )3 (4 j - k ). Find the angle between the vectors: 9. (4, 3, 0), (3, 0, 4) 10. (1, 2, 2), (-2, 1, - 2) 11. (6, 1, 5), (-2, - 3, 3) 12. (-5, 6, 1), (2, 3, - 8) 13. (1, 1,-1), (2, 0, 4) 14. (2, 2, 2), (-2, 2, - 2). Compute the distance between the points: 15. (0, 1, 2), (5, -3, 1) 16. (1, 1, 1), (1, - 1, 2) 17. (7, 0, 0), (2, 3, 4) 18. (8, 5,-1), (7, 9, 3). Find the direction cosines: 19. (1,0,1) 20. ( - 1,- 1,- 1) 21. (2, 1,- 3) 22. (4,- 7,- 4). 23. Find two non-collinear vectors perpendicular to (1, 1, 2). 24. Find the angle between the line joining (0, 0, 0) to (1, 1, 1) and the line joining (1, 0, 0) to (0, 1, 0). 25. Prove the Cauchy-Schwarz inequality: I V w| < | v| Iw| . When does equality hold? 26. (cont.) Now prove the triangle inequality: | v + w| < | v| + | w| . 130 4. 
SOLID ANALYTIC GEOMETRY [Hint: | v + w| 2 = | (v + w ) (v + w)| = I (v + w ) ' V + (v + w)w| < | v + w| -1v| + | v + w| '| w| . ] 27. Prove the identity | v + w| 2 | v w| 2 = 4 v w . 28. Let u be a unit vector. Show that the formula v = (v u )u [v (v u )u ] expresses v as the sum of two vectors, one parallel to u , the other perpendicular to u , and is the only such expression. 29. Let u = (cos ai, cos a2, cos ck3) and v = (cos ft, cos ft, cos ft) be two unit vectors. Show that the angle 6 between them satisfies cos 6 = cos ai cos ft + cos a2 cos ft + cos <23 cos ft. Interpret the formula when a3= ft = ^7r. 3. LINES AND PLANES Take a non-zero vector v. The set of all scalar multiples x = tv of v is a line through the origin (Fig. 3.1a). If x0is any point, then the set of all points x = x0 + v is a line through x0parallel to the first line (Fig. 3.1b). The equa tion x = x0 + tv is called a parametric vector equation for the line. For example (jci, x2, xs) = (0, 1, - 2) + t ( 2, 1, 3) is a parametric equation for the line through (0, 1, 2) parallel to (2, 1,3). This vector equation is equivalent to three parametric scalar equations: xi = 2 1, x2 = t + 1, X3 = 3t 2 . Fi g . 3.1 (a) line through 0: X = tv (b) line through x0 parallel to v 3. Lines and Planes 131 Given a point x0 of R3, and a non-zero vector v, the line through x0 parallel to v consists of all points X = Xo + Jv, where oo < t < o o . E XAM P L E 3.1 Find a parametric vector equation for the line through (3, 1, 2) and (4, 1, 1). Solution: The line passes through x0 = (3, 1, 2) and is parallel to v = (4 , 1 , 1 ) - (3, - 1,2) = (1,2, 1). Hence x = (3, - 1,2) +*(1,2, - 1) is a parametric form for the line. Answer: x *=(3, 1,2) +$(1,2, 1)
= (3 + t, - 1 + 2t, 2 - t).
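The construction of Example 3.1 (take one point as $\mathbf{x}_0$ and the difference of the two points as the direction $\mathbf{v}$) can be sketched in a few lines. The function names below are illustrative only.

```python
def line_through(x0, x1):
    """Return (x0, v) with v = x1 - x0, so that the line is x = x0 + t v."""
    v = tuple(b - a for a, b in zip(x0, x1))
    if all(c == 0 for c in v):
        raise ValueError("the two points must be distinct")
    return x0, v

def point_at(x0, v, t):
    """The point x0 + t v on the line."""
    return tuple(a + t * c for a, c in zip(x0, v))

x0, v = line_through((3, -1, 2), (4, 1, 1))
print(v)                   # direction vector of the line
print(point_at(x0, v, 0))  # t = 0 gives the first point
print(point_at(x0, v, 1))  # t = 1 gives the second point
```

Any non-zero multiple of `v`, or any other point of the line used as `x0`, describes the same line with a different parametrization.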
Equation of a Plane
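Let P be a plane in $\mathbf{R}^3$. Draw the line L through $\mathbf{0}$ perpendicular to P and
take one of the two unit vectors along L; call it $\mathbf{n}$. See Fig. 3.2. The line L
meets P in a point $p\mathbf{n}$. If $\mathbf{x}$ is any point of P, then $\mathbf{x} - p\mathbf{n}$ is perpendicular
to $\mathbf{n}$, therefore
$$(\mathbf{x} - p\mathbf{n})\cdot\mathbf{n} = 0, \qquad \mathbf{x}\cdot\mathbf{n} = p\,\mathbf{n}\cdot\mathbf{n} = p.$$
Fig. 3.2 normal form of a plane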
Normal Form Each plane in $\mathbf{R}^3$ can be represented by an equation
$$\mathbf{x}\cdot\mathbf{n} = p,$$
where $\mathbf{n}$ is a unit vector perpendicular to the plane and p is a real number.
Conversely, each equation $\mathbf{x}\cdot\mathbf{n} = p$ represents a plane, the one through
the point $p\mathbf{n}$ and perpendicular to $\mathbf{n}$.
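It follows that each linear equation in 3 variables,
$$a_1x_1 + a_2x_2 + a_3x_3 = b \qquad (a_1^2 + a_2^2 + a_3^2 \ne 0),$$
is the equation of a plane. Just set $\mathbf{a} = (a_1, a_2, a_3)$ and $\mathbf{n} = \mathbf{a}/|\mathbf{a}|$. Then the
equation, in vector notation, is
$$\mathbf{a}\cdot\mathbf{x} = b, \qquad\text{or}\qquad \mathbf{n}\cdot\mathbf{x} = \frac{b}{|\mathbf{a}|} = p,$$
the normal form of a plane.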
Remark 1: There really are two normal forms of a plane,
$$\mathbf{x}\cdot\mathbf{n} = p, \qquad \mathbf{x}\cdot(-\mathbf{n}) = -p.$$
Remark 2: In applications it is often more convenient if $\mathbf{n}$ is not a unit
vector. Certainly it is simpler to write
$$(x_1, x_2, x_3)\cdot(1, 1, 1) = 5$$
than the equivalent normal form
$$(x_1, x_2, x_3)\cdot\tfrac13\sqrt3\,(1, 1, 1) = \tfrac53\sqrt3.$$
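Passing from a general linear equation $\mathbf{a}\cdot\mathbf{x} = b$ to normal form amounts to dividing both sides by $|\mathbf{a}|$. A minimal sketch, with a function name chosen here for illustration:

```python
import math

def normal_form(a, b):
    """Rewrite a . x = b as n . x = p, where n = a/|a| is a unit vector and p = b/|a|."""
    length = math.sqrt(sum(c * c for c in a))
    if length == 0:
        raise ValueError("a must be a non-zero vector")
    n = tuple(c / length for c in a)
    return n, b / length

n, p = normal_form((1, 1, 1), 5)
print(n, p)   # each component of n is sqrt(3)/3, and p is 5*sqrt(3)/3
```

Replacing `(a, b)` by `(-a, -b)` yields the second normal form mentioned in Remark 1.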
EXERCISES
Express in normal form:
1. x\ 2x2 + 2z3 = 1 2. 2x\ + 6^2 3z3 = 14
3. &Ei + X2 4z3 = 27 4. Sxi 2x2 6x3 = 4
5. x\ -j- x2 x33 6. x\ X2 ~\~x312.
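7. Find a parametric form for the line through two distinct points $\mathbf{x}_0$ and $\mathbf{x}_1$.
8. Let $\mathbf{x} = \mathbf{x}_0 + t\mathbf{u}$ and $\mathbf{x} = \mathbf{x}_1 + t\mathbf{v}$ be two parallel lines. Prove that $\mathbf{v} = c\mathbf{u}$ for
some c.
9. Let $\mathbf{x}\cdot\mathbf{n} = p$ be a plane in normal form and $\mathbf{x}_0$ a point. Prove that the distance from
$\mathbf{x}_0$ to the plane is $|\mathbf{x}_0\cdot\mathbf{n} - p|$.
10. (cont.) Prove that the point of the plane closest to $\mathbf{x}_0$ is $\mathbf{z} = \mathbf{x}_0 + (p - \mathbf{x}_0\cdot\mathbf{n})\mathbf{n}$.
11. Let $\mathbf{x} = \mathbf{x}_0 + t\mathbf{v}$ be a line in parametric form and let $\mathbf{x}\cdot\mathbf{n} = p$ be a plane in normal
form. Prove that the line and plane are parallel if and only if $\mathbf{v}\cdot\mathbf{n} = 0$.
12. (cont.) Prove that the line is on the plane if and only if $\mathbf{v}\cdot\mathbf{n} = 0$ and $\mathbf{x}_0\cdot\mathbf{n} = p$.
13. (cont.) Suppose $\mathbf{v}\cdot\mathbf{n} \ne 0$. Prove that the point of intersection of the line and the
plane is
$$\mathbf{x}_0 + \left[\frac{p - \mathbf{x}_0\cdot\mathbf{n}}{\mathbf{v}\cdot\mathbf{n}}\right]\mathbf{v}.$$
14. Let $\mathbf{x}\cdot\mathbf{m} = p$ and $\mathbf{x}\cdot\mathbf{n} = q$ be two non-parallel planes in normal form. Let $\theta$ be one
of their (dihedral) angles of intersection. Determine $\cos\theta$.
15. Let $\mathbf{x} = \mathbf{x}_0 + t\mathbf{u}$ be a line in parametric form, where $\mathbf{u}$ is a unit vector, and let $\mathbf{y}_0$
be a point. Find the point on the line closest to $\mathbf{y}_0$.
16. (cont.) Prove that the distance D from $\mathbf{y}_0$ to the line satisfies
$$D^2 = |\mathbf{x}_0 - \mathbf{y}_0|^2 - [(\mathbf{x}_0 - \mathbf{y}_0)\cdot\mathbf{u}]^2.$$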
4. LINEAR SYSTEMS AND INTERSECTIONS
Suppose we are given three planes
$$\mathbf{a}_1\cdot\mathbf{x} = d_1, \qquad \mathbf{a}_2\cdot\mathbf{x} = d_2, \qquad \mathbf{a}_3\cdot\mathbf{x} = d_3.$$
In general, they will intersect in a single point $\mathbf{x}$. How can we find this point? In
coordinates, the problem is to solve simultaneously three linear equations
$$\begin{cases} a_1x + b_1y + c_1z = d_1 \\ a_2x + b_2y + c_2z = d_2 \\ a_3x + b_3y + c_3z = d_3 \end{cases}$$
for x, y, z, where $a_1, \cdots, d_3$ are given constants.
This is typical of problems in a subject called Linear Algebra, which we
shall take up in Chapter 7. While it is beyond the scope of this book to go
into the general theory, we shall give a practical method of solution.
EXAMPLE 4.1
Find the solution $(x, y, z)$ of the system
\[ \begin{cases} 2x - y + z = 4 \\ 3y + 2z = -1 \\ -z = 3. \end{cases} \]
Solution: By the third equation, $z = -3$. Substitute this into the first
two equations. The result is a new system of two equations for $x$ and $y$:
\[ \begin{cases} 2x - y = 4 - (-3) = 7 \\ 3y = -1 - 2(-3) = 5. \end{cases} \]
By the second equation, $y = \tfrac{5}{3}$. Substitute this into the first equation; the
result is a single equation for $x$:
\[ 2x = 7 + \tfrac{5}{3} = \tfrac{26}{3}. \]
Its solution is $x = \tfrac{13}{3}$.
Answer: $(\tfrac{13}{3}, \tfrac{5}{3}, -3)$.
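The one-unknown-at-a-time procedure of Example 4.1 can be sketched in Python as back substitution on a triangular system. This is only an illustrative sketch: the function name `back_substitute` and the list-of-rows layout are our own assumptions, and exact fractions are used so that the result can be compared with the answer above.

```python
from fractions import Fraction

def back_substitute(U, d):
    """Solve an upper-triangular system U x = d, last unknown first.

    Hypothetical helper; assumes every diagonal entry of U is non-zero.
    """
    n = len(d)
    x = [Fraction(0)] * n
    for i in reversed(range(n)):
        # contribution of the unknowns that are already determined
        known = sum(Fraction(U[i][j]) * x[j] for j in range(i + 1, n))
        x[i] = (Fraction(d[i]) - known) / Fraction(U[i][i])
    return x

# Example 4.1: 2x - y + z = 4, 3y + 2z = -1, -z = 3.
U = [[2, -1, 1],
     [0, 3, 2],
     [0, 0, -1]]
d = [4, -1, 3]
solution = back_substitute(U, d)
print(solution)
```

The loop visits $z$, then $y$, then $x$, exactly in the order used in the solution of the example.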
This example was very easy because we could solve for the unknowns
one at a time. To solve a more general system, we reduce it to a system of this
type by eliminating the unknowns one by one.
EXAMPLE 4.2
Find the solution of the system
\[ \begin{cases} 2x - y + z = 4 \\ 2x + 2y + 3z = 3 \\ 6x - 9y - 2z = 17. \end{cases} \]
Solution: Eliminate $x$ from the second and third equations. Subtract
the first equation from the second, and subtract 3 times the first equation
from the third; the result is an equivalent system of three equations (the first
the same as before):
\[ \begin{cases} 2x - y + z = 4 \\ 3y + 2z = -1 \\ -6y - 5z = 5. \end{cases} \]
Now eliminate $y$ from the third equation. Add twice the second equation to
the third, but keep the first two equations:
\[ \begin{cases} 2x - y + z = 4 \\ 3y + 2z = -1 \\ -z = 3. \end{cases} \]
This is the system in Example 4.1, which we can solve by elimination.
Answer: $x = \tfrac{13}{3}$, $y = \tfrac{5}{3}$, $z = -3$.
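The elimination steps of Example 4.2 can be replayed mechanically. The sketch below stores each equation as a row $[a, b, c, d]$ meaning $ax + by + cz = d$; the helper name `add_multiple` and this row layout are our own illustrative choices, not notation from the text.

```python
from fractions import Fraction

def add_multiple(rows, c, src, dst):
    """Replace equation dst by (equation dst) + c * (equation src)."""
    rows[dst] = [b + c * a for a, b in zip(rows[src], rows[dst])]

# Example 4.2, one row [a, b, c, d] per equation ax + by + cz = d.
rows = [[Fraction(v) for v in r] for r in (
    [2, -1, 1, 4],
    [2, 2, 3, 3],
    [6, -9, -2, 17],
)]

add_multiple(rows, -1, 0, 1)   # subtract the first equation from the second
add_multiple(rows, -3, 0, 2)   # subtract 3 times the first from the third
add_multiple(rows, 2, 1, 2)    # add twice the second to the third

for r in rows:
    print([str(v) for v in r])
```

After the three operations the rows represent $2x - y + z = 4$, $3y + 2z = -1$, $-z = 3$, the triangular system of Example 4.1.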
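The method of elimination works for a very simple reason: adding a constant
multiple of one equation to another equation does not affect the solution
of the system. To verify this, suppose $\mathbf{a}_1\cdot\mathbf{x} = d_1$ and $\mathbf{a}_2\cdot\mathbf{x} = d_2$ are two linear
equations. If $c$ is any number, then the two systems
\[ \begin{cases} \mathbf{a}_1\cdot\mathbf{x} = d_1 \\ \mathbf{a}_2\cdot\mathbf{x} = d_2 \end{cases} \qquad\text{and}\qquad \begin{cases} \mathbf{a}_1\cdot\mathbf{x} = d_1 \\ (\mathbf{a}_2 + c\,\mathbf{a}_1)\cdot\mathbf{x} = d_2 + c\,d_1 \end{cases} \]
have exactly the same solutions. For if $\mathbf{x}$ satisfies $\mathbf{a}_1\cdot\mathbf{x} = d_1$ and $\mathbf{a}_2\cdot\mathbf{x} = d_2$, then
\[ (\mathbf{a}_2 + c\,\mathbf{a}_1)\cdot\mathbf{x} = \mathbf{a}_2\cdot\mathbf{x} + c\,(\mathbf{a}_1\cdot\mathbf{x}) = d_2 + c\,d_1. \]
Conversely, if $\mathbf{x}$ satisfies $\mathbf{a}_1\cdot\mathbf{x} = d_1$ and $(\mathbf{a}_2 + c\,\mathbf{a}_1)\cdot\mathbf{x} = d_2 + c\,d_1$, then
\[ \mathbf{a}_2\cdot\mathbf{x} = (\mathbf{a}_2 + c\,\mathbf{a}_1)\cdot\mathbf{x} - c\,(\mathbf{a}_1\cdot\mathbf{x}) = (d_2 + c\,d_1) - c\,d_1 = d_2. \]
Thus the two systems are equivalent; they have precisely the same solutions.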
We do not have to eliminate first $x$ and then $y$. Eliminate any two of the unknowns
in an order that makes the computation easiest.
EXAMPLE 4.3
Solve the system
\[ \begin{cases} 3x + y - 2z = 4 \\ -5x + 2z = 5 \\ -7x - y + 3z = -2. \end{cases} \]
Solution: Since $y$ is missing from the second equation, add the first to
the third; then $y$ is eliminated from two equations:
\[ \begin{cases} 3x + y - 2z = 4 \\ -5x + 2z = 5 \\ -4x + z = 2. \end{cases} \]
Now add $-2$ times the third to the second; this eliminates $z$:
\[ \begin{cases} 3x + y - 2z = 4 \\ 3x = 1 \\ -4x + z = 2. \end{cases} \]
By the second equation, $x = \tfrac{1}{3}$. By the third equation, $z = 2 + 4x = 2 + 4(\tfrac{1}{3}) = \tfrac{10}{3}$. Finally,
\[ y = 4 - 3x + 2z = 4 - 1 + \tfrac{20}{3} = \tfrac{29}{3}. \]
Answer: $x = \tfrac{1}{3}$, $y = \tfrac{29}{3}$, $z = \tfrac{10}{3}$.

Certain systems of equations do not have precisely one solution. There may be either no solutions at all, or more than one solution. In the latter case it turns out that there are infinitely many. Both cases can be handled by elimination.

Inconsistent Systems

Sometimes the elimination process leads to an equation
\[ 0x + 0y + 0z = d, \]
where $d \ne 0$. Obviously no choice of $x$, $y$, and $z$ will satisfy this equation. Then the system simply has no solution, and it is called an inconsistent system.

An example with two unknowns will suffice to illustrate this. Look at the system
\[ \begin{cases} x + 3y = -1 \\ 2x + 6y = 3. \end{cases} \]
Add $-2$ times the first equation to the second. The result is the system
\[ \begin{cases} x + 3y = -1 \\ 0 = 5. \end{cases} \]
There is no solution, for if there were, then $0 = 5$ would be a correct statement.

Underdetermined Systems

Some systems have more than one solution. This happens when one of the equations is a consequence of the other two, so that really there are only two (or fewer) equations. Such systems are called underdetermined.

First consider an example in two unknowns:
\[ \begin{cases} x + y = 1 \\ 2x + 2y = 2. \end{cases} \]
The second equation is obviously twice the first. If we try to eliminate $x$ by adding $-2$ times the first to the second, the result is
\[ \begin{cases} x + y = 1 \\ 0 = 0. \end{cases} \]
The second equation in this equivalent system gives no information whatever. Any solution of the first equation is a solution of the system. Thus there are infinitely many: each point $(x, y)$ on the line $x + y = 1$ is a solution. It is convenient to express these solutions in parametric form. A parametric representation of the line is $(x, y) = (t, 1 - t)$, where $-\infty < t < \infty$. Hence, the solution of the system is
\[ x = t, \qquad y = 1 - t, \qquad -\infty < t < \infty. \]

Now consider the example
\[ \begin{cases} x + y + z = 1 \\ x - 2y + 2z = 4 \\ 2x - y + 3z = 5. \end{cases} \]
Eliminate $x$ from the second and third equations:
\[ \begin{cases} x + y + z = 1 \\ -3y + z = 3 \\ -3y + z = 3. \end{cases} \]
The last two equations both say the same thing.
Therefore the system is equivalent to the system
\[ \begin{cases} x + y + z = 1 \\ -3y + z = 3, \end{cases} \]
and no further elimination is possible. To get a parametric solution, set $y = t$. Then $z = 3 + 3y = 3 + 3t$, and
\[ x = 1 - y - z = 1 - t - (3 + 3t) = -2 - 4t. \]
The most general solution is
\[ (x, y, z) = (-2 - 4t,\ t,\ 3 + 3t), \]
where $-\infty < t < \infty$. The set of solutions is a line, $(x, y, z) = (-2, 0, 3) + t(-4, 1, 3)$.

Finally, consider the system
\[ \begin{cases} x + y + z = 1 \\ 2x + 2y + 2z = 2 \\ 3x + 3y + 3z = 3. \end{cases} \]
This system is obviously equivalent to the single equation $x + y + z = 1$. Therefore the set of solutions $(x, y, z)$ is a plane in space. For a parametric solution, set $x = s$ and $y = t$. Then $z = 1 - s - t$. The general solution is
\[ (x, y, z) = (s,\ t,\ 1 - s - t). \]

Intersections of Planes

Now we can close the books on intersection of planes. If $\mathbf{a}_1\cdot\mathbf{x} = d_1$, $\mathbf{a}_2\cdot\mathbf{x} = d_2$, $\mathbf{a}_3\cdot\mathbf{x} = d_3$ are three planes in space, we find the points common to the three planes by the elimination method. Generally, there is a single common point (Fig. 4.1a). However, if the planes are parallel or if one is parallel to the intersection of the other two, then there is no common point (Fig. 4.1c). In this case, the corresponding system of equations is inconsistent. For example, the system
\[ \begin{cases} x + y + z = 1 \\ x + y + z = 2 \\ 3x - 2y + 4z = 7 \end{cases} \]
is obviously inconsistent; the first two equations cannot both be satisfied. Geometrically, the first two planes are parallel.

[Fig. 4.1: possible intersections of three distinct planes; (c) no intersection, parallel lines]

The planes have more than one common point if they pass through a common line, or if two or all of them coincide. In this case, the corresponding system of equations is underdetermined. For example, the system
\[ \begin{cases} x - 2y + 3z = 5 \\ 8x + 7y + z = 2 \\ 2x - 4y + 6z = 10 \end{cases} \]
is underdetermined; the third equation is twice the first. Geometrically, the first and third planes coincide. The system represents two distinct planes that have a line in common (Fig. 4.1b).
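The three possible outcomes of elimination (a single solution, an inconsistent system, an underdetermined system) can be detected mechanically. The following Python sketch is one way to do it, under our own assumptions: the function name `classify`, the augmented-row layout $[a, b, c, d]$, and the use of row interchanges to find a non-zero coefficient are illustrative choices and are not taken from the text.

```python
from fractions import Fraction

def classify(system):
    """Classify a linear system given as rows [coefficients..., right side].

    Returns 'unique', 'inconsistent', or 'underdetermined'.
    Hypothetical helper for illustration.
    """
    rows = [[Fraction(v) for v in r] for r in system]
    n_unknowns = len(rows[0]) - 1
    pivot_row = 0
    for col in range(n_unknowns):
        # look for an equation, not yet used, that still contains this unknown
        pivot = next((r for r in range(pivot_row, len(rows))
                      if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[pivot_row], rows[pivot] = rows[pivot], rows[pivot_row]
        # eliminate the unknown from all later equations
        for r in range(pivot_row + 1, len(rows)):
            c = -rows[r][col] / rows[pivot_row][col]
            rows[r] = [b + c * a for a, b in zip(rows[pivot_row], rows[r])]
        pivot_row += 1
    # every remaining row now reads 0 = d
    if any(r[-1] != 0 for r in rows[pivot_row:]):
        return "inconsistent"
    return "unique" if pivot_row == n_unknowns else "underdetermined"

print(classify([[1, 3, -1], [2, 6, 3]]))                          # x + 3y = -1, 2x + 6y = 3
print(classify([[1, 1, 1, 1], [1, -2, 2, 4], [2, -1, 3, 5]]))      # three equations, one redundant
print(classify([[2, -1, 1, 4], [2, 2, 3, 3], [6, -9, -2, 17]]))    # Example 4.2
```

An equation $0 = d$ with $d \ne 0$ signals inconsistency; otherwise the system is determined exactly when every unknown has been used to eliminate once.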
Review of Determinants

An important computational tool in solving linear systems is the use of determinants. We recall from elementary algebra the definitions of determinants of orders two and three:
\[ \begin{vmatrix} a_1 & b_1 \\ a_2 & b_2 \end{vmatrix} = a_1b_2 - a_2b_1, \]
\[ \begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix} = a_1b_2c_3 + a_2b_3c_1 + a_3b_1c_2 - a_1b_3c_2 - a_2b_1c_3 - a_3b_2c_1. \]
From the defining formulas: (1) if two rows (columns) are equal, the determinant is zero; (2) if two rows (columns) are transposed, the determinant changes sign; (3) if a multiple of one row (column) is added to another row (column), the determinant is unchanged; and (4) if all the terms in one row (column) are multiplied by a scalar, the determinant is multiplied by the same scalar.

Also the defining formulas imply various expansions by minors of a row (column), for instance
\[ \begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix} = a_1\begin{vmatrix} b_2 & c_2 \\ b_3 & c_3 \end{vmatrix} - b_1\begin{vmatrix} a_2 & c_2 \\ a_3 & c_3 \end{vmatrix} + c_1\begin{vmatrix} a_2 & b_2 \\ a_3 & b_3 \end{vmatrix} \]
is the expansion by minors of the first row. Here, for instance,
\[ \begin{vmatrix} a_2 & c_2 \\ a_3 & c_3 \end{vmatrix} \]
is the minor of $b_1$. It is the $2 \times 2$ determinant remaining after the row and the column containing $b_1$ are crossed off.

A system of equations
\[ \begin{cases} a_1x + b_1y + c_1z = d_1 \\ a_2x + b_2y + c_2z = d_2 \\ a_3x + b_3y + c_3z = d_3 \end{cases} \]
is both consistent (not inconsistent) and determined (not underdetermined) if and only if the system determinant $D \ne 0$, where
\[ D = \begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix}. \]
When this is so, the system has a unique solution, given explicitly by Cramer's Rule:
\[ x = \frac{1}{D}\begin{vmatrix} d_1 & b_1 & c_1 \\ d_2 & b_2 & c_2 \\ d_3 & b_3 & c_3 \end{vmatrix}, \qquad y = \frac{1}{D}\begin{vmatrix} a_1 & d_1 & c_1 \\ a_2 & d_2 & c_2 \\ a_3 & d_3 & c_3 \end{vmatrix}, \qquad z = \frac{1}{D}\begin{vmatrix} a_1 & b_1 & d_1 \\ a_2 & b_2 & d_2 \\ a_3 & b_3 & d_3 \end{vmatrix}. \]
Cramer's Rule will be derived in Chapter 7, Section 7.

EXERCISES
Solve by elimination:
1. $x + 2y = 1$, $3y = 2$
3. $x + 2y = 1$, $x + 3y = 2$
5. $2x - 3y = 1$, $3x + 5y = 2$
7. $x + y + z = 0$, $2y - 3z = 1$, $3y + 5z = 2$
9. $x + y - z = 0$, $x - y + z = 0$, $-x + y + z = 0$
11. $2x - y - 3z = 1$, $x - 4y - 2z = 1$, $3x - y - z = 1$
2. 4. 6. 8. 10.
12 2x =3 x + y = 0 x- ^y = a x y = b ' 2 x - 3 y = - I 3x + 62/ = 2 2x y 2 = 1 2y 3z = 1 32/ h 52= 2 2z + y + 32= 1 x + 4i/ + 2z = 0 3z+ y + 2= 1 4z + 2y 2 = 0 z + 32/ + 22= 0 k x + y + 32= 4. 5. Cross Product 141 Show that the system is inconsistent and interpret geometrically: [x = 2 13. 15. Find all solutions of each underdetermined system: x + y = 0 17. 19. 18. 20. x + y = 0 11* + 10y + 92= 5 x + 2y + 3z = 1 3x + 2 y + 2=1. 21*. Suppose a 5^ b, b 5* c, c a. Show that the system x + y + 2= d\ ax + by + cz = di a?x + 62?/ + c22= d 3 has a unique solution. 22*. Suppose the three planes a^x = 0, a2*x = 0, a3*x = 0 have only 0in common. Show that the three planes ai*x = di, a2*x = d2, a3*x = d3have exactly one point in common. [Hint: Apply the same elimination process to both systems.] 5. CROSS PRODUCT Geometric Definition Given a pair of vectors v and w, we define a new vector v X w. The cross product of v and w, written v X w, is the vector whose direction is determined by the right-hand rule from the pair v, w, and whose magnitude is the area of the parallelogram based on v and w. See Fig. 5.1. Note that v X w is a vector perpendicular both to v and to w. Note also that if v and w are collinear, then the parallelogram collapses, so v X w = 0. 142 4. SOLID ANALYTIC GEOMETRY I n particular, v X v = 0 for each vector v. I f v and w are interchanged, the thumb reverses direction, hence w X v = v X w. For the basic vectors i, j, k, the cross products are simply i X j = k, j X k = i, k X i = j. Re ma r k : This definition of cross product is motivated by physics. See the discussion of torque on p. 145. Analytic Definition Given a pair of vectors v = Vi\ + v2\ + ^ k and w = wit + w2] + w3k, it seems reasonable to define their cross product using the cross products of the basic vectors i, j, k as follows: v X w = (v!i + v2\ + v3k) X O i i + w2j + w3k ) = viw2i X j + v2wii X i + viwzi X k + vzWik X i + v2w3j X k + v3w2k X j. (We have used i X i = j X j = k X k = 0 .) 
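Now $\mathbf{i}\times\mathbf{j} = \mathbf{k} = -\mathbf{j}\times\mathbf{i}$, etc. Hence, collecting terms, we have
\[ \mathbf{v}\times\mathbf{w} = (v_2w_3 - v_3w_2)\mathbf{i} + (v_3w_1 - v_1w_3)\mathbf{j} + (v_1w_2 - v_2w_1)\mathbf{k}. \]
Note that each coefficient on the right is a $2 \times 2$ determinant.

Cross Product. Let $\mathbf{v} = (v_1, v_2, v_3)$ and $\mathbf{w} = (w_1, w_2, w_3)$. Then
\[ \mathbf{v}\times\mathbf{w} = \left( \begin{vmatrix} v_2 & v_3 \\ w_2 & w_3 \end{vmatrix},\ \begin{vmatrix} v_3 & v_1 \\ w_3 & w_1 \end{vmatrix},\ \begin{vmatrix} v_1 & v_2 \\ w_1 & w_2 \end{vmatrix} \right) = (v_2w_3 - v_3w_2,\ v_3w_1 - v_1w_3,\ v_1w_2 - v_2w_1). \]

Here are two numerical examples:
\[ (4, 3, -1)\times(-2, 2, 1) = \left( \begin{vmatrix} 3 & -1 \\ 2 & 1 \end{vmatrix},\ \begin{vmatrix} -1 & 4 \\ 1 & -2 \end{vmatrix},\ \begin{vmatrix} 4 & 3 \\ -2 & 2 \end{vmatrix} \right) = (3 + 2,\ 2 - 4,\ 8 + 6) = (5, -2, 14), \]
\[ (1, 0, 1)\times(0, 1, 1) = \left( \begin{vmatrix} 0 & 1 \\ 1 & 1 \end{vmatrix},\ \begin{vmatrix} 1 & 1 \\ 1 & 0 \end{vmatrix},\ \begin{vmatrix} 1 & 0 \\ 0 & 1 \end{vmatrix} \right) = (-1, -1, 1). \]

A device for remembering the cross product is a symbolic determinant, to be expanded by the first row:
\[ \mathbf{v}\times\mathbf{w} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ v_1 & v_2 & v_3 \\ w_1 & w_2 & w_3 \end{vmatrix}. \]

For the moment we shall take the formula for $\mathbf{v}\times\mathbf{w}$ as the definition of cross product. Our problem is to prove that it has the required geometric properties. We begin the proof with a formula interesting in itself:
\[ \mathbf{u}\cdot(\mathbf{v}\times\mathbf{w}) = D(\mathbf{u}, \mathbf{v}, \mathbf{w}) = \begin{vmatrix} u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \\ w_1 & w_2 & w_3 \end{vmatrix}. \]
To prove it, simply expand the determinant $D(\mathbf{u}, \mathbf{v}, \mathbf{w})$ by the first row:
\[ D(\mathbf{u}, \mathbf{v}, \mathbf{w}) = u_1\begin{vmatrix} v_2 & v_3 \\ w_2 & w_3 \end{vmatrix} - u_2\begin{vmatrix} v_1 & v_3 \\ w_1 & w_3 \end{vmatrix} + u_3\begin{vmatrix} v_1 & v_2 \\ w_1 & w_2 \end{vmatrix} = \mathbf{u}\cdot(\mathbf{v}\times\mathbf{w}). \]
As a consequence of the formula,
\[ \mathbf{v}\cdot(\mathbf{v}\times\mathbf{w}) = D(\mathbf{v}, \mathbf{v}, \mathbf{w}) = 0, \]
because a determinant with two equal rows is zero. Hence $\mathbf{v}$ is perpendicular to $\mathbf{v}\times\mathbf{w}$. Similarly so is $\mathbf{w}$. We have proved

(1) $\mathbf{v}\times\mathbf{w}$ is perpendicular to $\mathbf{v}$ and to $\mathbf{w}$.

This is one of the required geometric conditions. Next, we prove a formula for the length of $\mathbf{v}\times\mathbf{w}$:
\[ |\mathbf{v}\times\mathbf{w}|^2 = |\mathbf{v}|^2|\mathbf{w}|^2 - (\mathbf{v}\cdot\mathbf{w})^2. \]
The left-hand side is
\[ (v_2w_3 - v_3w_2)^2 + (v_3w_1 - v_1w_3)^2 + (v_1w_2 - v_2w_1)^2 \]
and the right-hand side is
\[ (v_1^2 + v_2^2 + v_3^2)(w_1^2 + w_2^2 + w_3^2) - (v_1w_1 + v_2w_2 + v_3w_3)^2. \]
The first product in the right-hand side yields nine terms and the second product, six terms. The three terms like $v_1^2w_1^2$ cancel. There remain six terms like $v_1^2w_2^2$ and three terms like $-2v_1v_2w_1w_2$, exactly what occur on the left-hand side.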
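The component formula for the cross product is easy to check numerically. The following Python sketch (the function names `cross` and `dot` are our own, chosen for illustration) reproduces the two numerical examples and tests property (1) together with the length formula on one pair of vectors.

```python
def cross(v, w):
    """Cross product of two 3-vectors, from the component formula."""
    v1, v2, v3 = v
    w1, w2, w3 = w
    return (v2 * w3 - v3 * w2, v3 * w1 - v1 * w3, v1 * w2 - v2 * w1)

def dot(v, w):
    """Dot product of two vectors of equal length."""
    return sum(a * b for a, b in zip(v, w))

v, w = (4, 3, -1), (-2, 2, 1)
c = cross(v, w)
print(c, cross((1, 0, 1), (0, 1, 1)))

# (1): the cross product is perpendicular to both factors
print(dot(v, c), dot(w, c))

# both sides of the length formula for this pair of vectors
print(dot(c, c), dot(v, v) * dot(w, w) - dot(v, w) ** 2)
```

The last line prints the two sides of the length formula $|\mathbf{v}\times\mathbf{w}|^2 = |\mathbf{v}|^2|\mathbf{w}|^2 - (\mathbf{v}\cdot\mathbf{w})^2$, which agree for this pair.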
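From this last formula we deduce

(2) $|\mathbf{v}\times\mathbf{w}|$ is the area of the parallelogram determined by $\mathbf{v}$ and $\mathbf{w}$.

For let $\theta$ be the angle between $\mathbf{v}$ and $\mathbf{w}$. Then $\mathbf{v}\cdot\mathbf{w} = |\mathbf{v}|\,|\mathbf{w}|\cos\theta$, hence
\[ |\mathbf{v}\times\mathbf{w}|^2 = |\mathbf{v}|^2|\mathbf{w}|^2 - (\mathbf{v}\cdot\mathbf{w})^2 = |\mathbf{v}|^2|\mathbf{w}|^2(1 - \cos^2\theta) = |\mathbf{v}|^2|\mathbf{w}|^2\sin^2\theta, \]
\[ |\mathbf{v}\times\mathbf{w}| = |\mathbf{v}|\,|\mathbf{w}|\sin\theta. \]
This last expression is precisely the required area (Fig. 5.2). Finally we must prove

(3) $\mathbf{v}$, $\mathbf{w}$, $\mathbf{v}\times\mathbf{w}$ is a right-handed system (provided $\mathbf{v}$ and $\mathbf{w}$ are not parallel).

In order to do so, we need some analytic way of deciding whether a given triple $\mathbf{u}$, $\mathbf{v}$, $\mathbf{w}$ is a right-handed system or not.

[Fig. 5.2: $|\mathbf{v}\times\mathbf{w}|$ = area of parallelogram]

Now observe that $\mathbf{u}, \mathbf{v}, \mathbf{w}$ and $\mathbf{v}, \mathbf{u}, \mathbf{w}$ have opposite orientations, that is, one is a right-handed system and the other is left-handed. By analogy, the determinants $D(\mathbf{u}, \mathbf{v}, \mathbf{w})$ and $D(\mathbf{v}, \mathbf{u}, \mathbf{w})$ have opposite signs. This suggests that the sign of $D(\mathbf{u}, \mathbf{v}, \mathbf{w})$ corresponds to the orientation of $\mathbf{u}, \mathbf{v}, \mathbf{w}$. Since $\mathbf{i}, \mathbf{j}, \mathbf{k}$ is right-handed and $D(\mathbf{i}, \mathbf{j}, \mathbf{k}) = 1$, we suspect that $\mathbf{u}, \mathbf{v}, \mathbf{w}$ is right-handed if $D(\mathbf{u}, \mathbf{v}, \mathbf{w}) > 0$. This indeed is the case, but instead of proving it, we shall simply take the determinant criterion as the definition of right-handedness.

In view of this definition, we must prove that $D(\mathbf{v}, \mathbf{w}, \mathbf{v}\times\mathbf{w}) > 0$. Now
\[ D(\mathbf{v}, \mathbf{w}, \mathbf{v}\times\mathbf{w}) = -D(\mathbf{v}, \mathbf{v}\times\mathbf{w}, \mathbf{w}) = D(\mathbf{v}\times\mathbf{w}, \mathbf{v}, \mathbf{w}) = (\mathbf{v}\times\mathbf{w})\cdot(\mathbf{v}\times\mathbf{w}) = |\mathbf{v}\times\mathbf{w}|^2 > 0. \]
This completes the proof that the analytic and geometric definitions of $\mathbf{v}\times\mathbf{w}$ coincide.

We now summarize the main algebraic properties of the cross product. They follow readily from our discussion.
\[ \mathbf{v}\times\mathbf{v} = \mathbf{0}, \qquad \mathbf{w}\times\mathbf{v} = -\mathbf{v}\times\mathbf{w}, \]
\[ (a\mathbf{u} + b\mathbf{v})\times\mathbf{w} = a(\mathbf{u}\times\mathbf{w}) + b(\mathbf{v}\times\mathbf{w}), \]
\[ \mathbf{u}\times(a\mathbf{v} + b\mathbf{w}) = a(\mathbf{u}\times\mathbf{v}) + b(\mathbf{u}\times\mathbf{w}), \]
\[ \mathbf{u}\cdot(\mathbf{v}\times\mathbf{w}) = \mathbf{v}\cdot(\mathbf{w}\times\mathbf{u}) = \mathbf{w}\cdot(\mathbf{u}\times\mathbf{v}) = D(\mathbf{u}, \mathbf{v}, \mathbf{w}), \]
\[ \mathbf{v}\times\mathbf{w} = \mathbf{0} \text{ if and only if } \mathbf{v} \text{ and } \mathbf{w} \text{ are collinear.} \]

Remark: The quantity $\mathbf{u}\cdot(\mathbf{v}\times\mathbf{w}) = D(\mathbf{u}, \mathbf{v}, \mathbf{w})$ is sometimes written $[\mathbf{u}, \mathbf{v}, \mathbf{w}]$ and called the triple scalar product.

Torque

The original motivation for the cross product of vectors came from physics. Consider this situation. A rigid body is free to turn about the origin.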
A force $\mathbf{F}$ acts at a point $\mathbf{x}$ of the body. As a result the body wants to rotate about an axis through $\mathbf{0}$ perpendicular to the plane of $\mathbf{x}$ and $\mathbf{F}$ (unless $\mathbf{x}$ and $\mathbf{F}$ are collinear; then there is no turning). See Fig. 5.3a. As usual, the force vector $\mathbf{F}$ is drawn at its point of application $\mathbf{x}$. But analytically it starts at $\mathbf{0}$. See Fig. 5.3b. The positive axis of rotation is determined by the right-hand rule as applied to the pair $\mathbf{x}$, $\mathbf{F}$ in that order: $\mathbf{x}$ first, $\mathbf{F}$ second.

[Fig. 5.3: torque due to force $\mathbf{F}$ applied at $\mathbf{x}$]

In physics, one speaks of the torque (at the origin) resulting from the force $\mathbf{F}$ applied at $\mathbf{x}$. Roughly speaking, torque is a measure of the tendency of a body to rotate under the action of forces. (Torque will be defined precisely in a moment.) By experiment, if $\mathbf{F}$ is tripled in magnitude, the torque is tripled; if $\mathbf{x}$ is moved out twice as far along the same line and the same $\mathbf{F}$ is applied there, the torque is doubled. Hence the torque is proportional to the length of $\mathbf{x}$ and to the length of $\mathbf{F}$. Therefore (Fig. 5.3b) the torque is proportional to the area of the parallelogram determined by $\mathbf{x}$ and $\mathbf{F}$.

Resolve $\mathbf{F}$ into components $\mathbf{F}^{\parallel}$ and $\mathbf{F}^{\perp}$, where $\mathbf{F}^{\parallel}$ is parallel to $\mathbf{x}$ and $\mathbf{F}^{\perp}$ is perpendicular to $\mathbf{x}$. See Fig. 5.4a. Only $\mathbf{F}^{\perp}$ produces torque; the amount of torque is the product $|\mathbf{F}^{\perp}|\,|\mathbf{x}|$ of the magnitude of $\mathbf{F}^{\perp}$ by the length $|\mathbf{x}|$ of the lever arm. But this product is the area of the parallelogram determined by $\mathbf{x}$ and $\mathbf{F}$. See Fig. 5.4b.

[Fig. 5.4: Magnitude of torque equals area of parallelogram.]

Therefore the torque about the origin is completely described by the one vector $\mathbf{x}\times\mathbf{F}$. The length of $\mathbf{x}\times\mathbf{F}$ is the magnitude of the torque. The direction of $\mathbf{x}\times\mathbf{F}$ is the positive axis of rotation; with your right thumb along $\mathbf{x}\times\mathbf{F}$, your fingers curl in the direction of turning.

In physics, torque about the origin is defined to be the vector $\mathbf{x}\times\mathbf{F}$.

EXERCISES
Find the cross product:
1. $(-2, 2, 1)\times(4, 3, -1)$
3. $(1, 2, 3)\times(3, 2, 1)$
5.
( - 2 , - 2 , - 2 ) X (1,1,0) 7. (0, 0, 0) X (1, 1, 2) 9. (2, 1, 3) X (2, 2 , - 1 ) 2. (1,0, 1)X (1, 1,0) 4. (3, 1 , - 1 ) X ( 3 , - 1 , - 1 ) 6. ( - 1 , 2 , 2) X ( 3 , - 1 , 2 ) 8. ( 1 , - 1 , 1 ) X ( - l , 1 , - 1 ) 10. (1, 2, 3) X (4, 5, 6). A force F is applied at point x Find its torque about the origin: 1 1 . F = ( -1 , 1 , 1 ), x = (1 0 , 0 , 0 ) 1 3. F - ( -1 , 1 , 1 ), x = (2, 2, 1 ) Prove: 1 5. u - ( v X w ) = v ( w X u ) 1 7. (a v) X w = a (v X w) 12. F = (3, 0, 0), 14. F = ( 2 , - 1 , 5 ) , x = (0, 0, 1) x = ( - 7 , 1, 0). 16. (u + v ) X w = u X w + v X w 18. v X (6 w) = 6 (v X w ). 19*. Prove p* u p V q -u q * v = (p X q ) (u X w). 20. (cont.) Use this result for a new proof of u X v s r|2 (ll v)2. 21. Prove the formula u X (v X w) = (u * w)v (u -v )w. [Hint: By the linearity of each side and by symmetry, reduce to the cases where u, v, and w are chosen from i and j.] 22. (cont.) Prove the formula (a X b ) X (u X v ) = [(a X b )-v ]u - [(a X b )-u > = [(u X v)* a ]b [(u X v)* b ]a . Hence show that the left-hand side is a vector along the line of intersection of the plane of a and b with the plane of u and v. 23. (cont.) Show that (a X b ) X (a X c ) is collinear with a. 24. (cont.) Prove the Jacobi identity u X ( vX w) + v X (w X u) + w X (u X v) = 0. 6. APPLICATIONS Volume Two non-collinear vectors determine a parallelogram. Three non-coplanar vectors determine a parallelepiped (Fig. 6.1a) whose volume is given by the formula V = (area of base) (height). u X v 148 4. SOLID ANALYTIC GEOMETRY (a) parallelepiped determined (b) h = (projection of w on u X v) by three non-coplanar vec tors Fi g . 6.1 volume of a parallelepiped Suppose the base is the parallelogram determined by u and v; its area is |u X v|. Assume temporarily that w lies on the same side of the u, v-plane as u X v. Then the height is the projection of w on u X v. See Fig. 6.1b. There fore V = (projection w on u X v) |u X v| = (u X v) - w = D ( u, v, w). 
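If $\mathbf{w}$ is on the other side of the $\mathbf{u}, \mathbf{v}$-plane, then $V = -D(\mathbf{u}, \mathbf{v}, \mathbf{w})$. In any case:

The volume of the parallelepiped determined by three non-coplanar vectors $\mathbf{u}$, $\mathbf{v}$, $\mathbf{w}$ is given by the formula
\[ V = |D(\mathbf{u}, \mathbf{v}, \mathbf{w})|. \]

Intersection of Two Planes

Given two planes $\mathbf{x}\cdot\mathbf{m} = p$ and $\mathbf{x}\cdot\mathbf{n} = q$, how can we find their line of intersection? We must assume the planes are not parallel, that is, $\mathbf{m}$ and $\mathbf{n}$ are not collinear. Then $\mathbf{u} = \mathbf{m}\times\mathbf{n}$ is perpendicular to both $\mathbf{m}$ and $\mathbf{n}$, so $\mathbf{u}$ is parallel to the line of intersection (Fig. 6.2). If we can find a single point $\mathbf{x}_0$ on both planes, then the desired line is
\[ \mathbf{x} = \mathbf{x}_0 + t\mathbf{u} \]
in parametric form.

[Fig. 6.2: Intersection of two planes: $\mathbf{m}\times\mathbf{n}$ is parallel to their line of intersection. The point $\mathbf{x}_0$ is sought on this line of intersection and in the plane of $\mathbf{m}$ and $\mathbf{n}$.]

The figure suggests a point in the plane of $\mathbf{m}$ and $\mathbf{n}$, so let $\mathbf{x}_0 = a\mathbf{m} + b\mathbf{n}$. Then the conditions $\mathbf{x}_0\cdot\mathbf{m} = p$ and $\mathbf{x}_0\cdot\mathbf{n} = q$ result in a system of linear equations for $a$ and $b$:
\[ \begin{cases} a\,\mathbf{m}\cdot\mathbf{m} + b\,\mathbf{m}\cdot\mathbf{n} = p \\ a\,\mathbf{m}\cdot\mathbf{n} + b\,\mathbf{n}\cdot\mathbf{n} = q. \end{cases} \]
The determinant of this system is
\[ \begin{vmatrix} \mathbf{m}\cdot\mathbf{m} & \mathbf{m}\cdot\mathbf{n} \\ \mathbf{m}\cdot\mathbf{n} & \mathbf{n}\cdot\mathbf{n} \end{vmatrix} = |\mathbf{m}|^2|\mathbf{n}|^2 - (\mathbf{m}\cdot\mathbf{n})^2 = |\mathbf{m}\times\mathbf{n}|^2 > 0. \]
Therefore there is a unique solution.

EXAMPLE 6.1
Find the line of intersection of the planes $x + y + z = -1$ and $2x + y - z = 3$.

Solution: The equations of these planes are $\mathbf{x}\cdot\mathbf{m} = p$ and $\mathbf{x}\cdot\mathbf{n} = q$, where
\[ \mathbf{m} = (1, 1, 1), \qquad \mathbf{n} = (2, 1, -1), \qquad p = -1, \qquad q = 3. \]
Set
\[ \mathbf{u} = \mathbf{m}\times\mathbf{n} = (-2, 3, -1). \]
This vector is perpendicular to $\mathbf{m}$ and $\mathbf{n}$, so it has the direction of the line of intersection.

To find a point on the line of intersection, set $\mathbf{x}_0 = a\mathbf{m} + b\mathbf{n}$ and choose $a$ and $b$ so that $\mathbf{x}_0\cdot\mathbf{m} = -1$ and $\mathbf{x}_0\cdot\mathbf{n} = 3$. Now $|\mathbf{m}|^2 = 3$, $\mathbf{m}\cdot\mathbf{n} = 2$, $|\mathbf{n}|^2 = 6$, and the equations $\mathbf{x}_0\cdot\mathbf{m} = -1$ and $\mathbf{x}_0\cdot\mathbf{n} = 3$ become
\[ \begin{cases} 3a + 2b = -1 \\ 2a + 6b = 3. \end{cases} \]
The solution is $a = -\tfrac{6}{7}$, $b = \tfrac{11}{14}$; therefore
\[ \mathbf{x}_0 = -\tfrac{6}{7}(1, 1, 1) + \tfrac{11}{14}(2, 1, -1) = (\tfrac{5}{7}, -\tfrac{1}{14}, -\tfrac{23}{14}). \]
Answer: $\mathbf{x} = (\tfrac{5}{7}, -\tfrac{1}{14}, -\tfrac{23}{14}) + t(-2, 3, -1)$.

Homogeneous Equations

Suppose we are given three planes through the origin, $\mathbf{x}\cdot\mathbf{u} = 0$, $\mathbf{x}\cdot\mathbf{v} = 0$, $\mathbf{x}\cdot\mathbf{w} = 0$.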
In general, the planes will only have $\mathbf{0}$ in common. However, it may happen that they have a line in common, or even coincide. This occurs precisely when the three normal vectors $\mathbf{u}$, $\mathbf{v}$, $\mathbf{w}$ lie in the same plane. The situation can be described algebraically:

A system of three linear homogeneous equations
\[ \mathbf{x}\cdot\mathbf{u} = 0, \qquad \mathbf{x}\cdot\mathbf{v} = 0, \qquad \mathbf{x}\cdot\mathbf{w} = 0 \]
has a solution $\mathbf{x}_0 \ne \mathbf{0}$ (a non-trivial solution) if and only if
\[ \mathbf{u}\cdot(\mathbf{v}\times\mathbf{w}) = D(\mathbf{u}, \mathbf{v}, \mathbf{w}) = 0. \]
If $\mathbf{x}_0$ is any solution, then $t\mathbf{x}_0$ is a solution for each $t$.

It is useful to restate this result in terms of determinants and linear equations.

A homogeneous system of linear equations
\[ \begin{cases} u_1x_1 + u_2x_2 + u_3x_3 = 0 \\ v_1x_1 + v_2x_2 + v_3x_3 = 0 \\ w_1x_1 + w_2x_2 + w_3x_3 = 0 \end{cases} \]
has a solution $(x_1, x_2, x_3) \ne (0, 0, 0)$ if and only if
\[ \begin{vmatrix} u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \\ w_1 & w_2 & w_3 \end{vmatrix} = 0. \]
If $(x_1, x_2, x_3)$ is any solution, then $(tx_1, tx_2, tx_3)$ is a solution for each $t$.

EXAMPLE 6.2
Find a non-trivial solution of $\mathbf{x}\cdot\mathbf{u} = \mathbf{x}\cdot\mathbf{v} = \mathbf{x}\cdot\mathbf{w} = 0$, where $\mathbf{u} = (1, -2, 2)$, $\mathbf{v} = (3, 1, -2)$, and $\mathbf{w} = (5, -3, 2)$.

Solution: First
\[ \mathbf{u}\cdot(\mathbf{v}\times\mathbf{w}) = D(\mathbf{u}, \mathbf{v}, \mathbf{w}) = \begin{vmatrix} 1 & -2 & 2 \\ 3 & 1 & -2 \\ 5 & -3 & 2 \end{vmatrix} = 0. \]
(Note that $\mathbf{w} = 2\mathbf{u} + \mathbf{v}$.) Therefore the vectors $\mathbf{u}$, $\mathbf{v}$, $\mathbf{w}$ are coplanar (the parallelepiped collapses) so the corresponding perpendicular planes have a line $L$ in common. Certainly $L$ contains the point $\mathbf{0}$ which is common to all three planes. Furthermore it contains $\mathbf{u}\times\mathbf{v}$, since this vector starts at $\mathbf{0}$ and is parallel to $L$. Therefore $L$ is the set of all multiples $t(\mathbf{u}\times\mathbf{v})$, provided $\mathbf{u}\times\mathbf{v} \ne \mathbf{0}$. A similar statement holds for $\mathbf{v}\times\mathbf{w}$ and $\mathbf{w}\times\mathbf{u}$. Note that $\mathbf{u}\times\mathbf{v} = (2, 8, 7)$, while $\mathbf{v}\times\mathbf{w} = (-4, -16, -14)$ and $\mathbf{w}\times\mathbf{u} = (-2, -8, -7)$, so these three vectors really are collinear.

Remark: Recall that Cramer's Rule (Section 4) guarantees a unique solution if $D(\mathbf{u}, \mathbf{v}, \mathbf{w}) \ne 0$. Since $(0, 0, 0)$ is obviously a solution to the homogeneous system, it is the only solution when $D(\mathbf{u}, \mathbf{v}, \mathbf{w}) \ne 0$. This proves again that for a homogeneous system to have a non-trivial solution, its determinant must be zero.
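The computations of Example 6.2 can be checked with a few lines of Python. This is a sketch under our own naming (`cross`, `dot`, `triple` are hypothetical helpers): it evaluates $D(\mathbf{u}, \mathbf{v}, \mathbf{w})$ as $\mathbf{u}\cdot(\mathbf{v}\times\mathbf{w})$ and confirms that $\mathbf{u}\times\mathbf{v}$ satisfies all three homogeneous equations.

```python
def cross(v, w):
    """Cross product of two 3-vectors."""
    v1, v2, v3 = v
    w1, w2, w3 = w
    return (v2 * w3 - v3 * w2, v3 * w1 - v1 * w3, v1 * w2 - v2 * w1)

def dot(v, w):
    """Dot product of two vectors of equal length."""
    return sum(a * b for a, b in zip(v, w))

def triple(u, v, w):
    """Triple scalar product u.(v x w), i.e. the determinant D(u, v, w)."""
    return dot(u, cross(v, w))

u, v, w = (1, -2, 2), (3, 1, -2), (5, -3, 2)
d = triple(u, v, w)
x = cross(u, v)                      # candidate non-trivial solution

print(d)                             # determinant of the system
print(x, cross(v, w), cross(w, u))   # three collinear vectors
print([dot(x, n) for n in (u, v, w)])
```

The last line lists $\mathbf{x}\cdot\mathbf{u}$, $\mathbf{x}\cdot\mathbf{v}$, $\mathbf{x}\cdot\mathbf{w}$ for $\mathbf{x} = \mathbf{u}\times\mathbf{v}$; all three vanish, so $\mathbf{x}$ is a non-trivial solution.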
Skew Lines

Let $\mathbf{x} = \mathbf{x}_0 + s\mathbf{u}$ and $\mathbf{x} = \mathbf{y}_0 + t\mathbf{v}$ be two lines in $\mathbf{R}^3$ that do not intersect and are not parallel, i.e., skew lines. We ask how far apart they are (Fig. 6.3a). The vector $\mathbf{u}\times\mathbf{v}$ is perpendicular to both lines, so $\mathbf{n} = (\mathbf{u}\times\mathbf{v})/|\mathbf{u}\times\mathbf{v}|$ is a unit vector perpendicular to both lines. From Fig. 6.3b we see that the required distance is the length of the projection of $\mathbf{x}_0 - \mathbf{y}_0$ on $\mathbf{n}$, that is,
\[ |(\mathbf{x}_0 - \mathbf{y}_0)\cdot\mathbf{n}|. \]

EXAMPLE 6.3
Find the distance between the lines $\mathbf{x} = (-1 + s,\ s,\ 2 + 2s)$ and $\mathbf{x} = (1 - t,\ 1 - t,\ 1 - t)$.

[Fig. 6.3: (a) skew lines; (b) as seen from a direction that makes the lines appear to be parallel]

Solution: In this example
\[ \mathbf{u} = (1, 1, 2), \qquad \mathbf{v} = (-1, -1, -1), \qquad \mathbf{x}_0 = (-1, 0, 2), \qquad \mathbf{y}_0 = (1, 1, 1). \]
Therefore $\mathbf{u}\times\mathbf{v} = (1, -1, 0)$ and
\[ \mathbf{n} = \frac{\mathbf{u}\times\mathbf{v}}{|\mathbf{u}\times\mathbf{v}|} = \tfrac{1}{2}\sqrt{2}\,(1, -1, 0). \]
Finally,
\[ (\mathbf{x}_0 - \mathbf{y}_0)\cdot\mathbf{n} = (-2, -1, 1)\cdot\tfrac{1}{2}\sqrt{2}\,(1, -1, 0) = -\tfrac{1}{2}\sqrt{2}. \]
Answer: $\tfrac{1}{2}\sqrt{2}$.

Parametric Form of a Plane

Two non-collinear vectors $\mathbf{u}$ and $\mathbf{v}$ in $\mathbf{R}^3$ determine a unique plane through $\mathbf{0}$ consisting of all points
\[ \mathbf{x} = s\mathbf{u} + t\mathbf{v}, \]
where $s$ and $t$ are any real numbers. If $\mathbf{x}_0$ is any point of $\mathbf{R}^3$, then adding $\mathbf{x}_0$ to each point of the plane displaces it to a parallel plane through $\mathbf{x}_0$. See Fig. 6.4.

Given a point $\mathbf{x}_0$ of $\mathbf{R}^3$ and two non-collinear vectors $\mathbf{u}$ and $\mathbf{v}$, the plane through $\mathbf{x}_0$ parallel to the plane of $\mathbf{u}$ and $\mathbf{v}$ consists of all points
\[ \mathbf{x} = \mathbf{x}_0 + s\mathbf{u} + t\mathbf{v}, \]
where $-\infty < s < \infty$ and $-\infty < t < \infty$.

The variables $s$ and $t$ are called parameters, and a plane presented in this fashion is said to be in parametric form.

Example: Let $\mathbf{x}_0 = (-1, 1, 2)$, $\mathbf{u} = (1, 0, 1)$, $\mathbf{v} = (1, 1, 0)$. Clearly neither $\mathbf{u}$ nor $\mathbf{v}$ is a multiple of the other, so they are not collinear. Then
\[ \mathbf{x} = \mathbf{x}_0 + s\mathbf{u} + t\mathbf{v} = (-1, 1, 2) + s(1, 0, 1) + t(1, 1, 0) = (-1 + s + t,\ 1 + t,\ 2 + s). \]
In coordinates,
\[ x = -1 + s + t, \qquad y = 1 + t, \qquad z = 2 + s. \]

Given a plane in parametric form, how do we put it in normal form?
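We have $\mathbf{x} = \mathbf{x}_0 + s\mathbf{u} + t\mathbf{v}$, where $\mathbf{u}$, $\mathbf{v}$ are linearly independent. A vector perpendicular to both $\mathbf{u}$ and $\mathbf{v}$ (hence to the plane) is $\mathbf{u}\times\mathbf{v}$. This vector is guaranteed to be non-zero because $\mathbf{u}$ and $\mathbf{v}$ are not collinear. Set $\mathbf{n} = (\mathbf{u}\times\mathbf{v})/|\mathbf{u}\times\mathbf{v}|$. Then $\mathbf{n}$ is a unit vector perpendicular to the plane. Hence
\[ \mathbf{x}\cdot\mathbf{n} = \mathbf{x}_0\cdot\mathbf{n} + s\,\mathbf{u}\cdot\mathbf{n} + t\,\mathbf{v}\cdot\mathbf{n} = \mathbf{x}_0\cdot\mathbf{n} = p. \]
This is a normal form. In the example above,
\[ \mathbf{u}\times\mathbf{v} = (1, 0, 1)\times(1, 1, 0) = (-1, 1, 1), \qquad \mathbf{n} = \tfrac{1}{3}\sqrt{3}\,(-1, 1, 1), \qquad p = \mathbf{x}_0\cdot\mathbf{n} = \tfrac{4}{3}\sqrt{3}, \]
so a normal form is
\[ \tfrac{1}{3}\sqrt{3}\,(-x_1 + x_2 + x_3) = \tfrac{4}{3}\sqrt{3}. \]

Plane through Three Points

Three non-collinear points $\mathbf{x}_0$, $\mathbf{x}_1$, $\mathbf{x}_2$ determine a unique plane. If we want a normal form, we argue that $\mathbf{u} = \mathbf{x}_1 - \mathbf{x}_0$ and $\mathbf{v} = \mathbf{x}_2 - \mathbf{x}_0$ are parallel to the plane, hence $\mathbf{m} = \mathbf{u}\times\mathbf{v}$ is perpendicular to it. Therefore $\mathbf{x}\cdot\mathbf{m} = \mathbf{x}_0\cdot\mathbf{m}$ is an equation of the plane.

If we want a parametric form, then $\mathbf{x} = \mathbf{x}_0 + s\mathbf{u} + t\mathbf{v}$ does the job. Now
\[ \mathbf{x}_0 + s\mathbf{u} + t\mathbf{v} = \mathbf{x}_0 + s(\mathbf{x}_1 - \mathbf{x}_0) + t(\mathbf{x}_2 - \mathbf{x}_0) = (1 - s - t)\mathbf{x}_0 + s\mathbf{x}_1 + t\mathbf{x}_2, \]
so we have the alternative symmetric form:

The plane through three non-collinear points $\mathbf{x}_0$, $\mathbf{x}_1$, $\mathbf{x}_2$ consists of all points
\[ \mathbf{x} = s_0\mathbf{x}_0 + s_1\mathbf{x}_1 + s_2\mathbf{x}_2, \]
where $s_0$, $s_1$, $s_2$ take on all real values subject to $s_0 + s_1 + s_2 = 1$.

Here is a physical interpretation. Put masses $s_0$, $s_1$, $s_2$ at $\mathbf{x}_0$, $\mathbf{x}_1$, $\mathbf{x}_2$ respectively, where $s_0 + s_1 + s_2 = 1$. Then $\mathbf{x} = s_0\mathbf{x}_0 + s_1\mathbf{x}_1 + s_2\mathbf{x}_2$ is the center of gravity of the masses.

Equilibrium

Forces $\mathbf{F}_1, \dots, \mathbf{F}_n$ are applied at points $\mathbf{x}_1, \dots, \mathbf{x}_n$ of a rigid body (Fig. 6.5). Now a rigid body is in equilibrium when both the sum of the forces vanishes and the sum of the turning moments (torques) of the forces about $\mathbf{0}$ vanishes. Thus the conditions for equilibrium are the two vector equations:
\[ \mathbf{F}_1 + \mathbf{F}_2 + \cdots + \mathbf{F}_n = \mathbf{0}, \]
\[ \mathbf{x}_1\times\mathbf{F}_1 + \mathbf{x}_2\times\mathbf{F}_2 + \cdots + \mathbf{x}_n\times\mathbf{F}_n = \mathbf{0}. \]

EXERCISES
Find the volume of the parallelepiped determined by
1. $(1, 1, 0)$, $(0, 1, 1)$, $(1, 0, 1)$
2. $(4, -1, 0)$, $(3, 0, 2)$, $(1, 1, 1)$.
Find the line of intersection in parametric form of:
4. $x - y + z = 0$, $x + y + z = 3$
6. $x + y = 1$, $y + z = 1$.
3.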
x + 2y + 3z = 0, y z = 1 5. x 2y z = 3, 2x y + z = 4 Find a non-trivial solution, if it exists, of |2x + 6y = 0 7. [ 3x 9y = 0 5x + 4y + 3z = 0 x + 2y + z = 0 3x + y + z = 0 3x + 2z = 0 11 5x + Qy = 0 x + 5y 2 = 0 4x + 3y + 5z = 0 13. *4* 3y 5z = 0 12z + 9i/ + 152 = 0 Find an equation for the parametric plane: 15. x = (1, s, t) 16. x = (s, s + t, 1 + 0 17. x = (2 + , 1 + s + , s 0 18. x = (3s, 2s t, 1 -j- 2)* Find an equation for the plane through the three points: 19. (a, 0, 0), (0, b, 0), (0, 0, c), dbc 0 20. ( 1, 1, 0), ( 1, 0, 1), (0, 1, 1). 21. Prove a\ bi c\ 2 8. 10. 12. 14. I 2x + 4y = 0 3x y = 0 2x + 2y + 42 = 0 3x 8y 52 = 0 3x y + 22 = 0 3z + 3y + 2z = 0 7z + 5?/ + 122 = 0 x + 2y 3z = 0 6x 9i/ + 122 = 0 2x 3y + 42 = 0 10x + 15 y 20 2 = 0. &2 &2 c2 < (ai2+ a22+ a32)(6i2+ 622+ 632)(d2+ c22+ c32) 0^3 &3 C3 by interpreting the determinant as a volume. / 156 4. SOLID ANALYTIC GEOMETRY 22. A seesaw with unequal arms of lengths a and b is in horizontal equilibrium. Find the relations between weights A and B at the ends and the upward reaction C at the fulcrum. 23. Unit vertical forces act downward at the points pi, , pn of the horizontal x, 2/-plane. A force Facts at another point p of the plane so that the rigid system is in equilibrium. Find Fand p. 24. A force Fis applied at a point x. Its torque about a point p is (x p) X F. Suppose Fi, , Fnare applied at points Xi, , xn of a rigid body and the body is in equilib rium. Show that the sum of the torques about p vanishes. (Here p is any point of space, not just 0.) 25. A couple consists of a pair of opposite forces F and F applied at two different points p and q. Show that the total torque is unchanged if p and q are displaced the same amount, i.e., replaced by p + c and q + c. Interpret this total torque geometrically. 5 . Ve cto r C a lcu lu s 1. VECTOR FUNCTIONS In this chapter we study functions whose values are vectors. 
For example, the position $\mathbf{x}$ of a moving particle at time $t$, or the gravitational force $\mathbf{F}$ on an orbiting satellite at time $t$ are vector functions.

To indicate that $\mathbf{x}$ is a function of time, we write
\[ \mathbf{x} = \mathbf{x}(t); \]
in components,
\[ \mathbf{x}(t) = (x(t), y(t), z(t)). \]
Thus a vector function is a single expression for three ordinary (scalar) functions $x = x(t)$, $y = y(t)$, $z = z(t)$.

[Fig. 1.1]

What is the derivative of a vector function? Think of $\mathbf{x} = \mathbf{x}(t)$ as tracing a path in space (Fig. 1.1). For $h$ small, the difference vector
\[ \mathbf{x}(t + h) - \mathbf{x}(t) \]
represents the secant from $\mathbf{x}(t)$ to $\mathbf{x}(t + h)$. The difference quotient
\[ \frac{\mathbf{x}(t + h) - \mathbf{x}(t)}{h} \]
represents this (short) secant divided by the small number $h$. The limit as $h \to 0$ is called the derivative of the vector function:
\[ \dot{\mathbf{x}}(t) = \frac{d\mathbf{x}}{dt} = \lim_{h\to 0}\frac{\mathbf{x}(t + h) - \mathbf{x}(t)}{h}. \]
The derivative $\dot{\mathbf{x}}(t)$ is a vector in the direction of the tangent to the curve because the tangent is the limiting position of the secant.

To compute the derivative, express all vectors in components:
\[ \frac{\mathbf{x}(t + h) - \mathbf{x}(t)}{h} = \frac{1}{h}\bigl[(x(t + h), y(t + h), z(t + h)) - (x(t), y(t), z(t))\bigr] \]
\[ = \frac{1}{h}\bigl(x(t + h) - x(t),\ y(t + h) - y(t),\ z(t + h) - z(t)\bigr) = \left(\frac{x(t + h) - x(t)}{h},\ \frac{y(t + h) - y(t)}{h},\ \frac{z(t + h) - z(t)}{h}\right). \]
It follows that
\[ \lim_{h\to 0}\frac{\mathbf{x}(t + h) - \mathbf{x}(t)}{h} = \left(\lim_{h\to 0}\frac{x(t + h) - x(t)}{h},\ \lim_{h\to 0}\frac{y(t + h) - y(t)}{h},\ \lim_{h\to 0}\frac{z(t + h) - z(t)}{h}\right) = \left(\frac{dx}{dt}, \frac{dy}{dt}, \frac{dz}{dt}\right). \]
The result is:

The derivative of a vector function $\mathbf{x}(t) = (x(t), y(t), z(t))$ is the vector function
\[ \frac{d\mathbf{x}}{dt} = \left(\frac{dx}{dt}, \frac{dy}{dt}, \frac{dz}{dt}\right). \]

If a particle moves along a path $\mathbf{x}(t)$, its velocity is the vector function
\[ \mathbf{v}(t) = \frac{d\mathbf{x}}{dt}. \]
The magnitude $|\mathbf{v}(t)|$ of the velocity is called the speed. It is a scalar (numerical) function. The direction of $\mathbf{v}(t)$ is tangential to the path of motion.

EXAMPLE 1.1
The position of a moving particle at time $t$ is $(t, t^2, t^3)$.
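Find its velocity vector and its speed.

Solution: Let $\mathbf{x}(t) = (t, t^2, t^3)$. Then
\[ \mathbf{v}(t) = \dot{\mathbf{x}}(t) = (1, 2t, 3t^2), \]
\[ |\mathbf{v}(t)|^2 = 1 + (2t)^2 + (3t^2)^2 = 1 + 4t^2 + 9t^4. \]
Answer: $\mathbf{v}(t) = (1, 2t, 3t^2)$, $|\mathbf{v}(t)| = \sqrt{1 + 4t^2 + 9t^4}$.

EXAMPLE 1.2
If $\mathbf{x}(t)$ is a vector function whose derivative is zero, show that $\mathbf{x}(t) = \mathbf{c}$, a constant vector.

Solution:
\[ \dot{\mathbf{x}}(t) = (\dot{x}_1(t), \dot{x}_2(t), \dot{x}_3(t)) = (0, 0, 0). \]
Hence $\dot{x}_i(t) = 0$, and so $x_i(t) = c_i$ (constant) for $i = 1, 2, 3$. Therefore
\[ \mathbf{x}(t) = (c_1, c_2, c_3) = \mathbf{c}. \]

Remark: Physically, this example simply says that an object with zero velocity is standing still.

Differentiation Formulas

The following formulas are essential for differentiating vector functions. Each can be verified by differentiating components.
\[ \frac{d}{dt}\bigl[f(t)\mathbf{x}(t)\bigr] = \dot{f}(t)\mathbf{x}(t) + f(t)\dot{\mathbf{x}}(t), \]
\[ \frac{d}{dt}\bigl[\mathbf{x}(t) + \mathbf{y}(t)\bigr] = \dot{\mathbf{x}}(t) + \dot{\mathbf{y}}(t), \]
\[ \frac{d}{dt}\bigl[\mathbf{x}(t)\cdot\mathbf{y}(t)\bigr] = \dot{\mathbf{x}}(t)\cdot\mathbf{y}(t) + \mathbf{x}(t)\cdot\dot{\mathbf{y}}(t), \]
\[ \frac{d}{dt}(\mathbf{v}\times\mathbf{w}) = \dot{\mathbf{v}}\times\mathbf{w} + \mathbf{v}\times\dot{\mathbf{w}}, \]
\[ \frac{d}{dt}\,\mathbf{x}[s(t)] = \frac{d\mathbf{x}}{ds}\frac{ds}{dt} \quad\text{(Chain Rule)}. \]

To establish the first formula, for example, write
\[ f(t)\mathbf{x}(t) = \bigl(f(t)x_1(t),\ f(t)x_2(t),\ f(t)x_3(t)\bigr). \]
Then
\[ \frac{d}{dt}\bigl[f(t)\mathbf{x}(t)\bigr] = \left(\frac{d}{dt}[f(t)x_1(t)],\ \frac{d}{dt}[f(t)x_2(t)],\ \frac{d}{dt}[f(t)x_3(t)]\right) \]
\[ = \bigl(\dot{f}(t)x_1(t) + f(t)\dot{x}_1(t),\ \dot{f}(t)x_2(t) + f(t)\dot{x}_2(t),\ \dot{f}(t)x_3(t) + f(t)\dot{x}_3(t)\bigr) \]
\[ = \dot{f}(t)\bigl(x_1(t), x_2(t), x_3(t)\bigr) + f(t)\bigl(\dot{x}_1(t), \dot{x}_2(t), \dot{x}_3(t)\bigr) = \dot{f}(t)\mathbf{x}(t) + f(t)\dot{\mathbf{x}}(t). \]

EXAMPLE 1.3
Differentiate $t^2\mathbf{x}(t)$, where $\mathbf{x}(t) = (\cos 3t, \sin 3t, t)$.

Solution: Apply the first formula above:
\[ \frac{d}{dt}\bigl[t^2\mathbf{x}(t)\bigr] = 2t\,\mathbf{x}(t) + t^2\dot{\mathbf{x}}(t) = 2t(\cos 3t, \sin 3t, t) + t^2(-3\sin 3t, 3\cos 3t, 1). \]
Answer: $(2t\cos 3t - 3t^2\sin 3t,\ 2t\sin 3t + 3t^2\cos 3t,\ 3t^2)$.

EXAMPLE 1.4
Suppose $\mathbf{x}(t)$ is a moving unit vector. Show that $\mathbf{x}(t)$ is always perpendicular to its velocity vector $\mathbf{v}(t)$.

Solution: Verify that $\mathbf{x}(t)\cdot\mathbf{v}(t) = 0$:
\[ \mathbf{x}\cdot\mathbf{v} = x_1\dot{x}_1 + x_2\dot{x}_2 + x_3\dot{x}_3 = \frac{1}{2}\frac{d}{dt}\bigl(x_1^2 + x_2^2 + x_3^2\bigr). \]
But $x_1^2 + x_2^2 + x_3^2 = 1$ for every $t$, since $\mathbf{x}$ is a unit vector. Hence $\mathbf{x}\cdot\mathbf{v} = 0$.

Alternate Solution:
\[ \mathbf{x}(t)\cdot\mathbf{x}(t) = |\mathbf{x}(t)|^2 = 1, \qquad \frac{d}{dt}\bigl[\mathbf{x}(t)\cdot\mathbf{x}(t)\bigr] = 0. \]
But by the third differentiation formula above,
\[ \frac{d}{dt}\bigl[\mathbf{x}(t)\cdot\mathbf{x}(t)\bigr] = \dot{\mathbf{x}}(t)\cdot\mathbf{x}(t) + \mathbf{x}(t)\cdot\dot{\mathbf{x}}(t) = 2\,\mathbf{x}(t)\cdot\dot{\mathbf{x}}(t). \]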
at Thus x (2) x (2) = 0, that is, x (2) v (2) = 0. 2. Space Curves 161 R e m a r k : This example makes sense geometrically. A moving unit vector represents a particle on the unit sphere |x| = 1. Its velocity vector is tangent to the sphere, i.e., perpendicular to the radius. EXERCISES Differentiate: 1. x(0 = (e, e2*, e3t) 2. x(2) = (t\ 25, t) 3. x(0 = (2+ 1, 3 2 - 1, 41) 4. x(2) = (it2, 0, 23). Find the velocity and the speed: 5. x(2) = (t \ t *+ t \ 1) 6. x(2) = (22- 1, 32+1, - 2 1+ 1) 7. x(2) = (A coscot, A sinco2, Bt) 8. x(t) = (a^ + ^ + 62, ^2 + 63). 9. Suppose that x = x(t) is a moving point such that x(2) is always perpendicular to x(2). Show that x(t) moves on a sphere with center at 0. [Hint: Differentiate |x|2.] d 1 10. Suppose x(t) 7* 0. Show that |x(2)| = ^ x*x. 11. Prove the formula ^ Cx ( 0 + y(0] = x(2) + y(0- at d 12. Prove the formula Cx( 0 #y(0] = x*y + x*y. at 13. Prove the formula ^- (x Xy ) = x X y + x Xy . at 14. Suppose x(t) is a space curve which does not pass through 0, and that x(20) is the point of the curve closest to 0. Show that x(fo)#x(2o) = 0. 15. Suppose that x(t) and y(r) are two space curves which do not intersect. Suppose the distance x(t) y(r) is minimal at t = to and r = to. Show that the vector x(2o) y(r0) is perpendicular to the tangents to the two curves at x(20) and y(r0), respectively. 2. SPACE CURVES In this section we study the arc lengths and the tangents of curves in the plane and in space. To avoid analytic difficulties, we shall always assume that the vector functions under consideration have as many continuous derivatives as are needed. This applies to the following sections also. Length of a Curve Let x = x (2) represent a curve in space. How long is the part of the curve between the points x(2o) and x(2i)? To answer this question, we need a reasonable definition of curve length. Intuitively, the velocity vector x(2) is 162 5. VECTOR CALCULUS directed tangent to the curve (Fig. 
2.1), and its length $|\dot{\mathbf{x}}(t)|$ represents speed, the rate at which distance $s$ along the curve increases with respect to time.

[Fig. 2.1: the velocity vector $\mathbf{v}(t) = \dot{\mathbf{x}}(t)$]

This leads us to define arc length $s$ by
$$\frac{ds}{dt} = |\mathbf{v}(t)| = |\dot{\mathbf{x}}(t)|, \qquad s(t_0) = 0.$$
Therefore,
$$\left(\frac{ds}{dt}\right)^2 = |\dot{\mathbf{x}}(t)|^2 = \left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2 + \left(\frac{dz}{dt}\right)^2.$$
In terms of differentials,
$$ds = \sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2 + \left(\frac{dz}{dt}\right)^2}\ dt.$$

This formula has a simple geometric interpretation. See Fig. 2.2. The tiny bit of arc length $ds$ corresponds to three displacements $dx$, $dy$, and $dz$ along the coordinate axes. By the Distance Formula,
$$(ds)^2 = (dx)^2 + (dy)^2 + (dz)^2.$$

[Fig. 2.2]

Divide by $(dt)^2$ and take square roots to obtain
$$\frac{ds}{dt} = \sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2 + \left(\frac{dz}{dt}\right)^2}.$$
This is the time derivative of arc length. Integrate it to obtain the arc length itself.

Suppose $\mathbf{x}(t)$ describes a curve in space. Let $s(t)$ denote the length of the curve measured from a fixed initial point. Then
$$\frac{ds}{dt} = \sqrt{\dot{x}^2 + \dot{y}^2 + \dot{z}^2}.$$
The length of the curve from $\mathbf{x}(t_0)$ to $\mathbf{x}(t_1)$ is
$$L = \int_{t_0}^{t_1}\sqrt{\dot{x}^2 + \dot{y}^2 + \dot{z}^2}\ dt.$$

For plane curves the formula is slightly simpler because $z = 0$.

Suppose $\mathbf{x}(t) = (x(t), y(t))$ describes a plane curve. The length of the curve from $\mathbf{x}(t_0)$ to $\mathbf{x}(t_1)$ is
$$L = \int_{t_0}^{t_1}\sqrt{\dot{x}^2 + \dot{y}^2}\ dt.$$
If the curve is the graph of a function $y = f(x)$, then its length from $(x_0, f(x_0))$ to $(x_1, f(x_1))$ is
$$L = \int_{x_0}^{x_1}\sqrt{1 + f'(x)^2}\ dx.$$

The last formula is a special case of the preceding one. Indeed, set $x = t$, $y = f(t)$, where $x_0 \le t \le x_1$. Then $\dot{x} = 1$ and $\dot{y} = f'$, so
$$\frac{ds}{dt} = \sqrt{\dot{x}^2 + \dot{y}^2} = \sqrt{1 + (f')^2}.$$
The formula for $L$ follows.

EXAMPLE 2.1 Find the length of the parabola $\mathbf{x}(t) = (t, t^2)$, $0 \le t \le 1$.

Solution: This plane curve is a parabola because $x = t$, $y = t^2$, hence $y = x^2$. Its length is
$$L = \int_0^1\sqrt{\dot{x}^2 + \dot{y}^2}\ dt = \int_0^1\sqrt{1 + 4t^2}\ dt.$$
From integral tables,
$$\int_0^1\sqrt{1 + 4t^2}\ dt = \frac{\sqrt{5}}{2} + \frac{1}{4}\ln(2 + \sqrt{5}).$$

Answer: $\frac{1}{4}[2\sqrt{5} + \ln(2 + \sqrt{5})] \approx 1.479$.

EXAMPLE 2.2 Find the length of the curve $y = \sin x$ for $0 \le x \le \pi$.

Solution: The length $L$ is given by
$$L = \int_0^\pi\sqrt{1 + \left(\frac{dy}{dx}\right)^2}\ dx = \int_0^\pi\sqrt{1 + \cos^2 x}\ dx.$$
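An integral of this kind is easy to estimate numerically. The following is a minimal sketch, assuming composite Simpson's rule with an even number of subintervals; the names `simpson` and `arc_length_integrand` and the choice n = 100 are illustrative and not part of the text.

```python
import math

def simpson(f, a, b, n=100):
    """Composite Simpson's rule on [a, b] using n subintervals (n must be even)."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        # Interior points alternate between weight 4 (odd i) and weight 2 (even i).
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def arc_length_integrand(x):
    # sqrt(1 + (dy/dx)^2) for y = sin x, where dy/dx = cos x
    return math.sqrt(1.0 + math.cos(x) ** 2)

L = simpson(arc_length_integrand, 0.0, math.pi)
print(round(L, 3))
```

The printed estimate should agree to three decimals with the value 3.820 quoted for this example. The same `simpson` helper applies to any of the arc-length integrals in this section once the integrand is written as a function of the parameter.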
This integral cannot be evaluated exactly in terms of elementary functions. It can, however, be approximated by Simpson's Rule.

Answer: $L = \displaystyle\int_0^\pi\sqrt{1 + \cos^2 x}\ dx \approx 3.820$.

EXAMPLE 2.3 Find the length of the curve $\mathbf{x}(t) = (t\cos t, t\sin t, 2t)$, $0 \le t \le 4\pi$. Sketch the curve.

Solution: Since
$$x^2 + y^2 = (t\cos t)^2 + (t\sin t)^2 = t^2 = \tfrac{1}{4}z^2,$$
the curve lies on the right circular cone $z^2 = 4(x^2 + y^2)$. As $t$ increases, $z$ steadily increases also, while the projection $(t\cos t, t\sin t)$ of $\mathbf{x}(t)$ on the $x,y$-plane traces a spiral. Hence the space curve $\mathbf{x} = \mathbf{x}(t)$ is a spiral on the surface of the cone (Fig. 2.3).

Compute $ds/dt$:
$$\left(\frac{ds}{dt}\right)^2 = \left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2 + \left(\frac{dz}{dt}\right)^2 = (\cos t - t\sin t)^2 + (\sin t + t\cos t)^2 + (2)^2 = 5 + t^2.$$
Hence
$$L = \int_0^{4\pi}\sqrt{5 + t^2}\ dt = \frac{1}{2}\Big[t\sqrt{5 + t^2} + 5\ln\big(t + \sqrt{5 + t^2}\big)\Big]_0^{4\pi}.$$

Answer: $L = \dfrac{1}{2}\left[4\pi a + 5\ln\dfrac{4\pi + a}{\sqrt{5}}\right]$, where $a = \sqrt{5 + 16\pi^2}$.

Remark: Suppose the same geometric curve has two different parametrizations. How do we know that we get the same length? We might have $\mathbf{x} = \mathbf{x}(t)$ where $t_0 \le t \le t_1$, and $\mathbf{x} = \mathbf{x}(u)$ where the corresponding interval on the $u$-axis is $u_0 \le u \le u_1$. We suppose we can obtain either parametrization from the other by a smooth change of variable. Let us take $t = t(u)$ for the change of variable. We assume $t_0 = t(u_0)$, $t_1 = t(u_1)$, and $dt/du > 0$. The $t$-length and the $u$-length of the curve are
$$L_t = \int_{t_0}^{t_1}\left|\frac{d\mathbf{x}}{dt}\right| dt \qquad\text{and}\qquad L_u = \int_{u_0}^{u_1}\left|\frac{d\mathbf{x}}{du}\right| du.$$
By the Chain Rule, $d\mathbf{x}/du = (d\mathbf{x}/dt)(dt/du)$. The formula for change of variable in a definite integral implies
$$L_u = \int_{u_0}^{u_1}\left|\frac{d\mathbf{x}}{dt}\right|\frac{dt}{du}\ du = \int_{t_0}^{t_1}\left|\frac{d\mathbf{x}}{dt}\right| dt = L_t.$$
This proves that the length of a curve is a geometric quantity, independent of the analytic representation of the curve.

Unit Tangent Vector

EXAMPLE 2.4 Plot the locus $\mathbf{x}(t) = (t^2, t^3)$.

Solution: The locus is the plane curve described by $x = t^2$, $y = t^3$. Hence $x^3 = y^2$, $y = \pm x^{3/2}$. The curve is defined only for $x \ge 0$.
For each positive value of $x$, there are two values of $y$. (See Fig. 2.4.)

[Fig. 2.4: the curve $\mathbf{x}(t) = (t^2, t^3)$]

Remark: The sharp point at the origin is called a cusp. At that point, a particle moving along the curve changes direction abruptly. Note that its velocity at the origin is zero since
$$\mathbf{v} = \dot{\mathbf{x}} = (2t, 3t^2), \qquad \mathbf{v}(0) = \mathbf{0}.$$
In fact, an abrupt change in direction can occur only when the velocity vector is zero. Physically, this seems plausible; a moving particle cannot change direction suddenly unless it slows down to an instantaneous stop at the corner.

To avoid such curves with cusps as shown in Fig. 2.4, we study only curves $\mathbf{x}(t)$ for which the velocity $\dot{\mathbf{x}}(t)$ never equals zero.

Suppose a particle moves along a curve $\mathbf{x}(t)$. Its velocity vector $\mathbf{v}(t) = d\mathbf{x}/dt$ has length $ds/dt$ and is directed along the tangent to the curve; hence
$$\mathbf{v} = \frac{ds}{dt}\,\mathbf{T},$$
where $\mathbf{T}$ is a unit vector in the tangential direction. But by the Chain Rule,
$$\mathbf{v} = \frac{d\mathbf{x}}{dt} = \frac{d\mathbf{x}}{ds}\frac{ds}{dt}.$$
Compare these two expressions for $\mathbf{v}$; the result is
$$\frac{ds}{dt}\frac{d\mathbf{x}}{ds} = \frac{ds}{dt}\,\mathbf{T}.$$
Therefore $d\mathbf{x}/ds = \mathbf{T}$, since $ds/dt \neq 0$ is assumed.

If $\mathbf{x}(t)$ represents a space curve, then
$$\mathbf{T} = \frac{d\mathbf{x}}{ds}$$
is the unit tangent vector to the curve. In terms of the velocity vector $\mathbf{v}$,
$$\mathbf{T} = \frac{\mathbf{v}}{|\mathbf{v}|}.$$
(It is assumed $\mathbf{v} \neq \mathbf{0}$.)

EXAMPLE 2.5 Find the unit tangent vector to the curve $\mathbf{x}(t) = (t, t^2, t^3)$ at the point $\mathbf{x}(1) = (1, 1, 1)$.

Solution: $\mathbf{T} = \mathbf{v}/|\mathbf{v}|$, where
$$\mathbf{v} = \dot{\mathbf{x}} = (1, 2t, 3t^2), \qquad |\mathbf{v}|^2 = 1 + 4t^2 + 9t^4.$$
Hence
$$\mathbf{T} = \frac{1}{\sqrt{1 + 4t^2 + 9t^4}}\,(1, 2t, 3t^2).$$
Now substitute $t = 1$.

Answer: $\mathbf{T} = \dfrac{1}{\sqrt{14}}(1, 2, 3)$.

EXERCISES

1. Find the arc length of $\mathbf{x}(t) = (a_1t + b_1, a_2t + b_2, a_3t + b_3)$ for $0 \le t \le 1$.
2. Find the length of $\mathbf{x}(t) = (t^2, t^3)$ for $0 \le t \le a$.
3. Find the length of $\mathbf{x}(t) = (t, \sin t, \cos t)$ for $0 \le t \le 2\pi$.
4. Set up the length of $y = x^3$ for $-1 \le x \le 1$, but do not evaluate the integral.
5. Set up the length of $y = ax^n$ for $x_0 \le x \le x_1$, but do not evaluate the integral.
6. Set up the length of $\mathbf{x}(t) = (t^m, t^n, t^r)$ for $0 \le t \le b$, but do not evaluate the integral.
7. Find the length of $y = x^2 + 2x$ for $-1 \le x \le 1$.
8.
Carefully plot x(t) = (2, t* + tb) for $t$ near 0.
9. Find the unit tangent $\mathbf{T}$ to the curve $\mathbf{x}(t) = (t, \cos t, \sin t)$ at $t = 0$.
10. Find the unit tangent $\mathbf{T}$ to the curve x(t) = (32 1, 4, 2 + 1) at any point.
11. Find the unit tangent $\mathbf{T}$ to the curve $\mathbf{x}(t) = (a_1t + b_1, a_2t + b_2, a_3t + b_3)$ at any point.
12. Find the unit tangent $\mathbf{T}$ to the curve $\mathbf{x}(t) = (t\cos t, t\sin t, 2t)$ at any point.

3. CURVATURE

The curvature of a curve is a quantity which tells how fast the direction of the curve is changing relative to arc length.

The magnitude of the rate of change of the unit tangent $\mathbf{T}$ with respect to arc length is called the curvature of a curve, and is denoted by $k$:
$$k = \left|\frac{d\mathbf{T}}{ds}\right|.$$
Since the length of $\mathbf{T}$ is constant, $\mathbf{T}$ changes in direction only. Thus the curvature $k$ measures its rate of change of direction. The curvature is a geometric quantity; it does not depend on how the curve is parametrized.

EXAMPLE 3.1 A curve has curvature zero. What is the curve?

Solution: A natural guess is a straight line. Let us prove this is so. We are given $k = 0$. Therefore
$$\left|\frac{d\mathbf{T}}{ds}\right| = 0, \qquad\text{hence}\qquad \frac{d\mathbf{T}}{ds} = \mathbf{0}.$$
Consequently $\mathbf{T}$ is constant, $\mathbf{T} = \mathbf{T}_0 = (t_1, t_2, t_3)$. But $d\mathbf{x}/ds = \mathbf{T}_0$, which means
$$\frac{dx}{ds} = t_1, \qquad \frac{dy}{ds} = t_2, \qquad \frac{dz}{ds} = t_3.$$
Integrating, we have
$$x = a + t_1s, \qquad y = b + t_2s, \qquad z = c + t_3s.$$
In vector notation, $\mathbf{x} = \mathbf{x}_0 + s\mathbf{T}_0$. But this is the vector equation of the line through $\mathbf{x}_0$ parallel to $\mathbf{T}_0$.

Answer: A straight line.

Computation of Curvature

The following three formulas are needed to compute curvature. The first two apply to curves given in parametric form, the third to the graph of a function.

If $\mathbf{x} = \mathbf{x}(t)$ is a space curve, then
$$k = \frac{\left[\,|\dot{\mathbf{x}}|^2|\ddot{\mathbf{x}}|^2 - (\dot{\mathbf{x}}\cdot\ddot{\mathbf{x}})^2\right]^{1/2}}{|\dot{\mathbf{x}}|^3}.$$
If $\mathbf{x} = (x(t), y(t))$ is a plane curve, then
$$k = \frac{|\dot{x}\ddot{y} - \dot{y}\ddot{x}|}{(\dot{x}^2 + \dot{y}^2)^{3/2}}.$$
If a plane curve is the graph of a function $y = f(x)$, then
$$k = \frac{|f''(x)|}{[1 + f'(x)^2]^{3/2}}.$$

Proof: By the Chain Rule,
$$\dot{\mathbf{x}} = \frac{d\mathbf{x}}{ds}\frac{ds}{dt} = \frac{ds}{dt}\,\mathbf{T}, \qquad \ddot{\mathbf{x}} = \frac{d^2s}{dt^2}\,\mathbf{T} + \frac{ds}{dt}\frac{d\mathbf{T}}{dt} = \frac{d^2s}{dt^2}\,\mathbf{T} + \left(\frac{ds}{dt}\right)^2\frac{d\mathbf{T}}{ds}.$$
Since $\mathbf{T}$ is a unit vector, it is perpendicular to $d\mathbf{T}/ds$ (Example 1.4). Hence
$$\dot{\mathbf{x}}\cdot\ddot{\mathbf{x}} = \frac{ds}{dt}\frac{d^2s}{dt^2}, \qquad |\dot{\mathbf{x}}|^2 = \dot{\mathbf{x}}\cdot\dot{\mathbf{x}} = \left(\frac{ds}{dt}\right)^2,$$
$$|\ddot{\mathbf{x}}|^2 = \ddot{\mathbf{x}}\cdot\ddot{\mathbf{x}} = \left(\frac{d^2s}{dt^2}\right)^2 + \left(\frac{ds}{dt}\right)^4\left|\frac{d\mathbf{T}}{ds}\right|^2 = \left(\frac{d^2s}{dt^2}\right)^2 + \left(\frac{ds}{dt}\right)^4k^2.$$
Consequently
$$|\dot{\mathbf{x}}|^2|\ddot{\mathbf{x}}|^2 - (\dot{\mathbf{x}}\cdot\ddot{\mathbf{x}})^2 = \left(\frac{ds}{dt}\right)^6k^2 = |\dot{\mathbf{x}}|^6k^2.$$
The first formula for $k$ follows.

If $\mathbf{x} = (x(t), y(t))$ is a plane curve, then
$$|\dot{\mathbf{x}}|^2|\ddot{\mathbf{x}}|^2 - (\dot{\mathbf{x}}\cdot\ddot{\mathbf{x}})^2 = (\dot{x}^2 + \dot{y}^2)(\ddot{x}^2 + \ddot{y}^2) - (\dot{x}\ddot{x} + \dot{y}\ddot{y})^2 = (\dot{x}\ddot{y} - \dot{y}\ddot{x})^2,$$
so the second formula follows.

Finally, if the plane curve is the graph of $y = f(x)$, apply the second formula with $t = x$ and $\mathbf{x} = (t, f(t)) = (x, f(x))$. Then $\dot{x} = 1$, $\ddot{x} = 0$, $\dot{y} = f'(x)$, and $\ddot{y} = f''(x)$, so the third formula follows by direct substitution.

EXAMPLE 3.2 Find the curvature of a circle of radius $a$.

Solution: Let the equation of the circle be $x^2 + y^2 = a^2$. Thus
$$y = \pm\sqrt{a^2 - x^2}.$$
(This equation describes either the upper or lower half of the circle depending on whether the positive or negative square root is chosen.) Differentiate:
$$y' = -\frac{x}{y}.$$
Differentiate again:
$$y'' = -\frac{y - xy'}{y^2} = -\frac{y - x(-x/y)}{y^2} = -\frac{x^2 + y^2}{y^3} = -\frac{a^2}{y^3}.$$
Now
$$1 + y'^2 = 1 + \frac{x^2}{y^2} = \frac{a^2}{y^2}.$$
Hence by the formula for curvature,
$$k = \frac{|a^2/y^3|}{(a^2/y^2)^{3/2}} = \frac{a^2}{a^3} = \frac{1}{a}.$$

Alternate Solution: Write $\mathbf{x}(t) = (a\cos t, a\sin t)$. (This describes the circle by its central angle $t$.) Then
$$\dot{\mathbf{x}} = (-a\sin t, a\cos t), \qquad |\dot{\mathbf{x}}|^2 = \left(\frac{ds}{dt}\right)^2 = (-a\sin t)^2 + (a\cos t)^2 = a^2.$$
Hence
$$\frac{ds}{dt} = |\dot{\mathbf{x}}| = a, \qquad \mathbf{T} = \frac{1}{a}\,\dot{\mathbf{x}} = (-\sin t, \cos t).$$
Differentiate:
$$\frac{d\mathbf{T}}{ds}\frac{ds}{dt} = \frac{d\mathbf{T}}{dt} = (-\cos t, -\sin t).$$
Take lengths, substituting $a = ds/dt$:
$$a\left|\frac{d\mathbf{T}}{ds}\right| = [(-\cos t)^2 + (-\sin t)^2]^{1/2} = 1, \qquad k = \left|\frac{d\mathbf{T}}{ds}\right| = \frac{1}{a}.$$

Remark: The curvature of a circle is the reciprocal of its radius. This is reasonable on two counts. First, the curvature is the same at all points of a circle. Second, it is small for large circles, since the larger the circle the more slowly its direction changes per unit of arc length.

The Unit Normal

The vector $d\mathbf{T}/ds$ has length $k$, the curvature. Therefore
$$\frac{d\mathbf{T}}{ds} = k\mathbf{N},$$

[Fig. 3.1]
where $\mathbf{N}$ is a unit vector in the direction of $d\mathbf{T}/ds$. (We assume $k \neq 0$.) Since $\mathbf{T}$ is a unit vector, $\mathbf{T}$ is perpendicular to $d\mathbf{T}/ds$; this was shown in Example 1.4. The vector $\mathbf{N}$ is called the unit normal vector to the curve (Fig. 3.1). We summarize:

Let $\mathbf{x}(t)$ represent a curve in space. Then
$$\frac{d\mathbf{T}}{ds} = k\mathbf{N}, \qquad k = k(s),$$
$$|\mathbf{T}| = |\mathbf{N}| = 1, \qquad \mathbf{T}\cdot\mathbf{N} = 0.$$

The further study of space curves, not pursued here, begins with an analysis of $d\mathbf{N}/ds$. That leads to another quantity, the torsion, which measures how fast the plane of $\mathbf{T}$ and $\mathbf{N}$ is turning around the tangent line.

EXAMPLE 3.3 Compute $\mathbf{T}$, $\mathbf{N}$, and $k$ for the circular spiral (helix) $\mathbf{x}(t) = (a\cos t, a\sin t, bt)$. Assume $a > 0$ and $b > 0$.

Solution: The projection of $\mathbf{x}(t)$ on the $x,y$-plane is $(a\cos t, a\sin t, 0)$. As a particle describes the curve $\mathbf{x}(t)$, its projection describes a circle of radius $a$. The third component of $\mathbf{x}(t)$ is $bt$; the particle moves upward at a steady rate. Thus, the curve is a spiral; it is circular but steadily rising (Fig. 3.2).

Differentiate $\mathbf{x} = (a\cos t, a\sin t, bt)$:
$$\dot{\mathbf{x}} = (-a\sin t, a\cos t, b).$$
Introduce $c > 0$ by $c^2 = a^2 + b^2$. Then
$$|\dot{\mathbf{x}}|^2 = a^2 + b^2 = c^2, \qquad \frac{ds}{dt} = |\dot{\mathbf{x}}| = c,$$
and
$$\mathbf{T} = \frac{1}{c}\,\dot{\mathbf{x}} = \frac{1}{c}(-a\sin t, a\cos t, b).$$
To find $k$ and $\mathbf{N}$, use the relation $k\mathbf{N} = d\mathbf{T}/ds$:
$$k\mathbf{N} = \frac{d\mathbf{T}}{ds} = \frac{d\mathbf{T}}{dt}\frac{dt}{ds} = \frac{1}{c}\frac{d\mathbf{T}}{dt} = \frac{a}{c^2}(-\cos t, -\sin t, 0).$$
Since $k > 0$ and $\mathbf{N}$ is a unit vector,
$$k = \frac{a}{c^2}, \qquad \mathbf{N} = (-\cos t, -\sin t, 0).$$

Answer: $\mathbf{T} = \dfrac{1}{\sqrt{a^2 + b^2}}(-a\sin t, a\cos t, b)$, $\mathbf{N} = (-\cos t, -\sin t, 0)$, $k = \dfrac{a}{a^2 + b^2}$.

Remark: If $b = 0$, the spiral degenerates into a circle of radius $a$ and the curvature $k$ reduces to $1/a$, which agrees with Example 3.2.

EXERCISES

Find the curvature:
1. $y = x^2$; at $x = 1$
2. x(t) = (3, 2); at $t = 1$
3. $\mathbf{x}(t) = (t, t^2, t^3)$; at $t = -1$
4. $\mathbf{x}(t) = (a_1t + a_2t^3, b_1t + b_2t^3, c_1t + c_2t^3)$; at $t = 0$.

5. Let $\mathbf{x} = \mathbf{x}(s)$ be a plane curve. Show that $d\mathbf{N}/ds = -k\mathbf{T}$. [Hint: Differentiate $\mathbf{T}\cdot\mathbf{N} = 0$ and $\mathbf{N}\cdot\mathbf{N} = 1$.]
6. Find the point of the plane curve $y = x^2$ where $k$ is maximum.
7. Find the point of $y = \sin x$ where $k$ is maximum, $0 \le x \le \pi$.
8.
Find the curvature of $y = x^3$ at $x = 0$ and at $x = 1$.
9. Show that the curvature of a plane curve at an inflection point is zero.
10. Let the tangent line of a plane curve intersect the $x$-axis with angle $\alpha$. Show that $k = |d\alpha/ds|$. [Hint: Write $\mathbf{T} = (\cos\alpha, \sin\alpha)$.]
11. A point moves along the curve $y = e^x$ at the rate of 3 in./sec. How fast is the tangent turning when the point is at $(2, e^2)$?
12. Compute the maximum and minimum curvature of an ellipse with semimajor axis $a$ and semiminor axis $b$. Check the case $a = b$.
13. From a graph, predict the behavior of the curvature of $y = 1/x$ as $x \to 0$ and as $x \to \infty$. Verify your prediction.

4. VELOCITY AND ACCELERATION

If $\mathbf{x} = \mathbf{x}(t)$ represents the position of a moving particle, its velocity is
$$\mathbf{v}(t) = \dot{\mathbf{x}}(t)$$
and its acceleration is
$$\mathbf{a}(t) = \dot{\mathbf{v}}(t) = \ddot{\mathbf{x}}(t).$$

Velocity and acceleration are vectors, each having magnitude and direction. The direction of the velocity is the direction the particle is moving. The direction of the acceleration is the direction the particle is turning. The following example shows that the direction of the acceleration is not necessarily that of the velocity; it may even be perpendicular to the velocity.

EXAMPLE 4.1 The path of a particle moving around the circle $x^2 + y^2 = r^2$ is given by $\mathbf{x}(t) = (r\cos\omega t, r\sin\omega t)$, where $\omega$ is a constant. Find its velocity and acceleration vectors.

Solution: Differentiate to find $\mathbf{v}$ and $\mathbf{a}$:
$$\mathbf{v}(t) = \dot{\mathbf{x}}(t) = r\omega(-\sin\omega t, \cos\omega t),$$
$$\mathbf{a}(t) = \dot{\mathbf{v}}(t) = r\omega^2(-\cos\omega t, -\sin\omega t) = -\omega^2\mathbf{x}(t).$$

Answer: $\mathbf{v}(t) = r\omega(-\sin\omega t, \cos\omega t)$, $\mathbf{a}(t) = r\omega^2(-\cos\omega t, -\sin\omega t)$.

Remark: The speed, $|\mathbf{v}| = r\omega$, is constant; the motion is uniform circular motion. The velocity $\mathbf{v}(t)$ is perpendicular to the position vector $\mathbf{x}(t)$ since $\mathbf{x}(t)\cdot\mathbf{v}(t) = 0$. This is expected since each tangent to a circle is perpendicular to the corresponding radius. But $\mathbf{a}(t) = -\omega^2\mathbf{x}(t)$, so the acceleration vector $\mathbf{a}(t)$ is directed opposite to the position vector $\mathbf{x}(t)$. See Fig. 4.1.
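The three facts in this remark can be checked numerically at any instant. The following is a minimal sketch, assuming the closed-form expressions of Example 4.1; the function names and the sample values of r, omega, and t are illustrative only.

```python
import math

def circular_motion(r, omega, t):
    """Position, velocity, and acceleration for x(t) = (r cos wt, r sin wt)."""
    c, s = math.cos(omega * t), math.sin(omega * t)
    x = (r * c, r * s)
    v = (-r * omega * s, r * omega * c)
    a = (-r * omega**2 * c, -r * omega**2 * s)
    return x, v, a

def dot(p, q):
    """Dot product of two vectors given as tuples."""
    return sum(pi * qi for pi, qi in zip(p, q))

r, omega, t = 2.0, 3.0, 0.7  # illustrative values, not from the text
x, v, a = circular_motion(r, omega, t)

print("x . v     =", dot(x, v))          # velocity is perpendicular to position
print("|v|       =", math.hypot(*v))     # speed should equal r * omega
print("a + w^2 x =", tuple(ai + omega**2 * xi for ai, xi in zip(a, x)))
```

Up to rounding error, the first and third lines print zeros and the second prints the constant speed r·omega, whatever value of t is chosen.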
What is the physical meaning of this phenomenon?

[Fig. 4.1] [Fig. 4.2: the vectors $\mathbf{x}(t)$, $\mathbf{x}(t + h)$, and $\mathbf{v}(t + h) - \mathbf{v}(t)$]

Remember that $\mathbf{a}(t)$ is the rate of change of the velocity vector. Observe the velocity vectors at $t$ and an instant later at $t + h$. See Fig. 4.2. The difference $\mathbf{v}(t + h) - \mathbf{v}(t)$ is nearly parallel, but oppositely directed, to $\mathbf{x}(t)$. Thus the velocity is changing in a direction opposite to $\mathbf{x}(t)$. It seems reasonable, therefore, that $\mathbf{a}(t) = -c\,\mathbf{x}(t)$, where $c > 0$.

Newton's Law of Motion

This famous principle states that
$$\text{force} = \text{mass} \times \text{acceleration}.$$
But force and acceleration are vectors; both have magnitude and direction. Thus Newton's Law is a vector equation:
$$\mathbf{F} = m\ddot{\mathbf{x}}.$$
It is equivalent to three scalar equations for the components:
$$F_1 = m\ddot{x}_1, \qquad F_2 = m\ddot{x}_2, \qquad F_3 = m\ddot{x}_3.$$

EXAMPLE 4.2 A particle of mass $m$ is subject to zero force. What is its trajectory?

Solution: By Newton's Law,
$$m\ddot{\mathbf{x}} = \mathbf{0}, \qquad \ddot{\mathbf{x}} = \mathbf{0}.$$
Since $\ddot{\mathbf{x}} = \dot{\mathbf{v}}$, we have $\dot{\mathbf{v}} = \mathbf{0}$. Integrate once; $\mathbf{v}$ is constant:
$$\mathbf{v} = \mathbf{v}_0, \qquad \frac{d\mathbf{x}}{dt} = \mathbf{v}_0.$$
Integrate again:
$$\mathbf{x} = t\mathbf{v}_0 + \mathbf{x}_0.$$
The result is a straight line.

Answer: The trajectory is a straight line, traversed at constant speed.

Remark: Let us check the second integration in components. The equation $d\mathbf{x}/dt = \mathbf{v}_0$ means
$$\dot{x}_1 = v_{01}, \qquad \dot{x}_2 = v_{02}, \qquad \dot{x}_3 = v_{03},$$
where the $v_{0j}$ are constants. Integrating,
$$x_1 = tv_{01} + x_{01}, \qquad x_2 = tv_{02} + x_{02}, \qquad x_3 = tv_{03} + x_{03}.$$
Written as a vector equation, this is simply $\mathbf{x} = t\mathbf{v}_0 + \mathbf{x}_0$.

EXAMPLE 4.3 A shell is fired at an angle $\alpha$ with the ground. What is its path? Neglect air resistance.

[Fig. 4.3]

Solution: Draw a figure, taking the axes as indicated (Fig. 4.3). Let $\mathbf{v}_0$ be the initial velocity vector, so
$$\mathbf{v}_0 = v_0(\cos\alpha, \sin\alpha),$$
where $v_0$ is the initial speed. Let $m$ denote the mass of the shell. The force of gravity at each point is constant,
$$\mathbf{F} = (0, -mg).$$
The equation of motion is
$$m\mathbf{a} = \mathbf{F}, \qquad\text{that is,}\qquad \frac{d^2\mathbf{x}}{dt^2} = (0, -g).$$
Integrate:
$$\frac{d\mathbf{x}}{dt} = (0, -gt) + \mathbf{v}_0.$$
Integrate again, noting that $\mathbf{x}_0 = \mathbf{0}$ by the choice of axes:
$$\mathbf{x} = \left(0, -\tfrac{1}{2}gt^2\right) + t\mathbf{v}_0.$$
Hence
$$(x(t), y(t)) = \left(0, -\tfrac{1}{2}gt^2\right) + tv_0(\cos\alpha, \sin\alpha) = \left(v_0t\cos\alpha,\ v_0t\sin\alpha - \tfrac{1}{2}gt^2\right).$$
To describe the path, eliminate $t$:
$$x = v_0t\cos\alpha, \qquad t = \frac{x}{v_0\cos\alpha},$$
$$y = v_0t\sin\alpha - \frac{1}{2}gt^2 = x\tan\alpha - \frac{g}{2v_0^2\cos^2\alpha}\,x^2.$$
The graph of this quadratic is a parabola.

Answer: $x = (v_0\cos\alpha)t$, $y = (v_0\sin\alpha)t - \frac{1}{2}gt^2$, where $v_0$ is the initial speed. The path is a parabola:
$$y = x\tan\alpha - \frac{g}{2v_0^2\cos^2\alpha}\,x^2.$$

EXAMPLE 4.4 In Example 4.3, what is the maximum range?

Solution: The shell hits ground when $y = 0$:
$$(v_0\sin\alpha)t - \tfrac{1}{2}gt^2 = 0.$$
This equation has two roots. The root $t = 0$ indicates the initial point. We want the other root,
$$t = \frac{2v_0\sin\alpha}{g}.$$
The range is the value of $x$ at this time:
$$x = (v_0\cos\alpha)\frac{2v_0\sin\alpha}{g} = \frac{v_0^2}{g}\sin 2\alpha.$$
Clearly $x$ is maximum when $\sin 2\alpha = 1$, or $\alpha = \pi/4$. The maximum range is $v_0^2/g$.

Answer: The maximum range is $v_0^2/g$. It is obtained by firing at 45°.

Remark: If the initial speed is doubled, the maximum range is quadrupled. Is this reasonable? (By what factor must the gunpowder be increased to double the initial speed?)

Components of Acceleration

The arc length $s$, the unit tangent $\mathbf{T}$, the unit normal $\mathbf{N}$, and the curvature $k$ are geometric properties of a curve. If a particle moves on the curve, it is useful to express its velocity and acceleration in terms of these quantities. We already know
$$\mathbf{v} = \frac{ds}{dt}\,\mathbf{T},$$
which says that the motion is directed along the tangent with speed $ds/dt$. For further information, differentiate $\mathbf{v}$ with respect to time, using the Chain Rule carefully:
$$\mathbf{a} = \frac{d\mathbf{v}}{dt} = \frac{d^2s}{dt^2}\,\mathbf{T} + \frac{ds}{dt}\frac{d\mathbf{T}}{dt} = \frac{d^2s}{dt^2}\,\mathbf{T} + \left(\frac{ds}{dt}\right)^2\frac{d\mathbf{T}}{ds} = \ddot{s}\,\mathbf{T} + k\dot{s}^2\,\mathbf{N}.$$

This is an important equation in mechanics. It says that the acceleration is composed of two perpendicular components. The first is a tangential component with magnitude $\ddot{s}$, the rate of change of the speed. The second is a normal component, directed along $\mathbf{N}$ with magnitude $k\dot{s}^2$. It is called the centripetal acceleration.

EXAMPLE 4.5 A particle moves along a circle. Find its velocity and acceleration.
Solution: Let $r$ denote the radius and let $\theta = \theta(t)$ denote the central angle at time $t$. Place the circle in the $x,y$-plane with center at $\mathbf{0}$. Then the path is given by
$$\mathbf{x}(t) = r(\cos\theta, \sin\theta).$$
Differentiate:
$$\mathbf{v} = \dot{\mathbf{x}} = r\dot\theta(-\sin\theta, \cos\theta).$$
It follows that
$$\mathbf{T} = (-\sin\theta, \cos\theta), \qquad \frac{ds}{dt} = r\dot\theta = r\omega(t).$$
Here $\omega(t) = \dot\theta(t)$ represents the instantaneous angular speed. Thus
$$\mathbf{v} = r\omega(-\sin\theta, \cos\theta) = r\omega\mathbf{T}.$$
Differentiate:
$$\mathbf{a} = \dot{\mathbf{v}} = r\dot\omega\mathbf{T} + r\omega\dot{\mathbf{T}} = r\dot\omega(-\sin\theta, \cos\theta) + r\omega^2(-\cos\theta, -\sin\theta) = r\dot\omega\mathbf{T} + r\omega^2\mathbf{N}.$$

Answer: $\mathbf{v} = r\omega\mathbf{T}$, $\mathbf{a} = r\dot\omega\mathbf{T} + r\omega^2\mathbf{N}$, where $\omega = \dot\theta$ is the angular speed.

Remark: When the motion is uniform ($\omega$ constant), then $\mathbf{a} = r\omega^2\mathbf{N}$, so the acceleration is all centripetal, perpendicular to the direction of motion. This agrees with the answer to Example 4.1.

Angular Velocity

A rigid body rotates about an axis $a$ through $\mathbf{0}$. See Fig. 4.4a. The central angle is $\theta = \theta(t)$, so $\omega = \dot\theta$ is the angular speed, the rate of rotation in radians per second. The angular velocity is defined to be the vector $\boldsymbol{\omega}$ having magnitude $\dot\theta$ and pointing along the (positive) axis of rotation according to the right-hand rule (Fig. 4.4b).

[Fig. 4.4: (a), (b)]

Suppose the actual velocity $\mathbf{v}$ of a point $\mathbf{x}$ in the rigid body is required. How can it be expressed in terms of $\mathbf{x}$ and the angular velocity $\boldsymbol{\omega}$? See Fig. 4.5.

[Fig. 4.5]

Since the point $\mathbf{x}$ is rotating about the axis of $\boldsymbol{\omega}$, its velocity vector $\mathbf{v}$ is perpendicular to the plane of $\boldsymbol{\omega}$ and $\mathbf{x}$. By the right-hand rule, $\mathbf{v}$ points in the direction of $\boldsymbol{\omega}\times\mathbf{x}$. The speed $|\mathbf{v}|$ is the product of the angular speed $\omega = |\boldsymbol{\omega}|$ and the distance $r$ of $\mathbf{x}$ from the axis of rotation. But $r = |\mathbf{x}|\sin\theta$, where $\theta$ here is the angle between $\boldsymbol{\omega}$ and $\mathbf{x}$; hence
$$|\mathbf{v}| = |\boldsymbol{\omega}|\,|\mathbf{x}|\sin\theta = |\boldsymbol{\omega}\times\mathbf{x}|.$$
Therefore:

The velocity of a point $\mathbf{x}$ in a rigid body rotating with angular velocity $\boldsymbol{\omega}$ is
$$\mathbf{v} = \boldsymbol{\omega}\times\mathbf{x}.$$

EXERCISES

1. A hill makes angle $\beta$ with the ground (Fig. 4.6). A shell is fired from the base of the hill at angle $\alpha$ with the ground.
Show that the $x$-component of the position where the shell strikes the hill is $x = (2v_0^2/g)(\sin\alpha\cos\alpha - \tan\beta\cos^2\alpha)$.

[Fig. 4.6]

2. (cont.) Find the maximum of $x$ as a function of $\alpha$. Show that it occurs for $\alpha = \frac{1}{4}\pi + \frac{1}{2}\beta$.
3. Let $\mathbf{x}(t) = (t, t^2)$. Find $\mathbf{v}(t)$ and $\mathbf{a}(t)$.
4. (cont.) Find the tangential and normal components of $\mathbf{a}$ at $t = 0$ and $t = 1$.
5. A particle moves along the curve $y = \sin x$ with constant speed 1. Find the tangential and normal components of $\mathbf{a}$ at $x = 0$ and at $x = \pi/2$.
6. Find the tangential and normal components of acceleration for $\mathbf{x}(t) = (\cos t^2, \sin t^2)$.
7. Find the tangential and normal components of acceleration for $\mathbf{x}(t) = (a\cos\omega t, a\sin\omega t, bt)$, where $\omega$ is constant.
8. A particle moves with constant speed 1 on the surface of the unit sphere $|\mathbf{x}| = 1$. Show that the normal component of the acceleration has magnitude at least 1.
9. A particle moves on the surface $z = x^2 + y^2$ with constant speed 1. At a certain instant $t_0$ it passes through $\mathbf{0}$. Show that the tangential component of $\mathbf{a}$ is $\mathbf{0}$ and the normal component is $(\ddot{x}(t_0), \ddot{y}(t_0), 2)$. Show also that $\dot{x}\ddot{x} + \dot{y}\ddot{y} = 0$ at $t_0$.
10. Let $\mathbf{x} = \mathbf{x}(t)$ be a space curve. Show that its curvature is $k = |\mathbf{v}\times\mathbf{a}|/|\mathbf{v}|^3$.
11. The earth turns on its axis with angular velocity 360° per day. Find the actual speed (mph) of a point on the surface (a) at the equator, (b) at the 40th parallel, (c) at the south pole. Approximate the earth by a sphere of radius 4000 miles.

5. INTEGRALS

Suppose a particle moves along a path $\mathbf{x} = \mathbf{x}(t)$ from $\mathbf{x}(t_0)$ to $\mathbf{x}(t_1)$, acted on by a force $\mathbf{F} = \mathbf{F}(t)$. How much work is done by the force?

From physics we learn that only the component of the force in the direction of motion does work, and that the amount of work done in a small displacement of length $ds$ is
$$dW = F_a\,ds,$$
where $F_a$ is the average component of force in the direction of motion.

[Fig. 5.1: the path $\mathbf{x} = \mathbf{x}(t)$ with the force $\mathbf{F}$]

Draw the unit tangent $\mathbf{T}$, the force $\mathbf{F}$, and a small portion of the path of length $ds$. See Fig. 5.1.
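Since $\mathbf{T}$ is a unit vector in the direction of motion, the component of the force $\mathbf{F}$ in the direction of motion is the dot product $\mathbf{F}\cdot\mathbf{T}$. Hence
$$dW = (\mathbf{F}\cdot\mathbf{T})\,ds.$$
Replace $ds$ by $(ds/dt)\,dt$. This makes good physical sense: The length $ds$ traveled in a short time period $dt$ is given by speed $\times$ time $= (ds/dt)\,dt$. The result is
$$dW = \mathbf{F}\cdot\mathbf{T}\,\frac{ds}{dt}\,dt.$$
But
$$\frac{ds}{dt}\,\mathbf{T} = \frac{d\mathbf{x}}{dt} = \mathbf{v},$$
hence
$$dW = (\mathbf{F}\cdot\mathbf{v})\,dt.$$
We define the total work done by an integral that adds up these small bits of work from $\mathbf{x}(t_0)$ to $\mathbf{x}(t_1)$:
$$W = \int_{t_0}^{t_1}\mathbf{F}\cdot\mathbf{v}\,dt = \int_{t_0}^{t_1}\mathbf{F}\cdot\frac{d\mathbf{x}}{dt}\,dt.$$
Since $(d\mathbf{x}/dt)\,dt = d\mathbf{x}$, the integral can be written
$$W = \int_{\mathbf{x}(t_0)}^{\mathbf{x}(t_1)}\mathbf{F}\cdot d\mathbf{x}.$$

This type of integral is called a line integral. It arises naturally in connection with work, but has many other practical applications in physics. The evaluation of a line integral involves nothing more than the evaluation of an ordinary integral.

Suppose a particle moves on a curve $\mathbf{x}(t)$ from $\mathbf{x}(t_0)$ to $\mathbf{x}(t_1)$ and is subject to a force $\mathbf{F}(t)$. Then the work done by the force is given by the line integral
$$W = \int_{t_0}^{t_1}\left(\mathbf{F}\cdot\frac{d\mathbf{x}}{dt}\right)dt = \int_{\mathbf{x}(t_0)}^{\mathbf{x}(t_1)}\mathbf{F}\cdot d\mathbf{x}.$$
In the integral on the right,
$$d\mathbf{x} = \frac{d\mathbf{x}}{dt}\,dt.$$

Let $\mathbf{F}(t) = (F_1(t), F_2(t), F_3(t))$. Then the line integral is evaluated as an ordinary integral:
$$\int_{\mathbf{x}(t_0)}^{\mathbf{x}(t_1)}\mathbf{F}\cdot d\mathbf{x} = \int_{t_0}^{t_1}(F_1\dot{x}_1 + F_2\dot{x}_2 + F_3\dot{x}_3)\,dt.$$

EXAMPLE 5.1 Evaluate the line integral $\displaystyle\int_{\mathbf{x}(0)}^{\mathbf{x}(3)}\mathbf{F}\cdot d\mathbf{x}$, where $\mathbf{F} = (3, -1, 2)$ and $\mathbf{x}(t) = (t, t^2, t^3)$.

Solution:
$$d\mathbf{x} = (1, 2t, 3t^2)\,dt, \qquad \mathbf{F}\cdot d\mathbf{x} = (3, -1, 2)\cdot(1, 2t, 3t^2)\,dt = (3 - 2t + 6t^2)\,dt.$$
Therefore
$$\int_{\mathbf{x}(0)}^{\mathbf{x}(3)}\mathbf{F}\cdot d\mathbf{x} = \int_0^3(3 - 2t + 6t^2)\,dt = (3t - t^2 + 2t^3)\Big|_0^3 = 9 - 9 + 54.$$

Answer: 54.

EXAMPLE 5.2 Under the action of a force $\mathbf{F}(t)$, a particle moves on a path $\mathbf{x}(t)$ from $\mathbf{x}(t_0)$ to $\mathbf{x}(t_1)$. Let $W$ denote the work done by the force. From Newton's Law, show that
$$W = \tfrac{1}{2}m|\mathbf{v}(t_1)|^2 - \tfrac{1}{2}m|\mathbf{v}(t_0)|^2.$$

Solution:
$$W = \int_{t_0}^{t_1}\mathbf{F}\cdot\dot{\mathbf{x}}\,dt.$$
According to Newton's Law, $\mathbf{F} = m\ddot{\mathbf{x}}$, so $\mathbf{F}\cdot\dot{\mathbf{x}} = m\ddot{\mathbf{x}}\cdot\dot{\mathbf{x}}$. But observe that
$$2\,\ddot{\mathbf{x}}\cdot\dot{\mathbf{x}} = \frac{d}{dt}(\dot{\mathbf{x}}\cdot\dot{\mathbf{x}}).$$
Therefore
$$\mathbf{F}\cdot\dot{\mathbf{x}} = \frac{1}{2}m\frac{d}{dt}(\dot{\mathbf{x}}\cdot\dot{\mathbf{x}}) = \frac{1}{2}m\frac{d}{dt}|\mathbf{v}|^2,$$
so
$$W = \int_{t_0}^{t_1}\mathbf{F}\cdot\dot{\mathbf{x}}\,dt = \int_{t_0}^{t_1}\frac{1}{2}m\frac{d}{dt}|\mathbf{v}|^2\,dt = \frac{1}{2}m|\mathbf{v}(t_1)|^2 - \frac{1}{2}m|\mathbf{v}(t_0)|^2.$$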
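Remark: The quantity $\frac{1}{2}m|\mathbf{v}|^2$ is the kinetic energy of the particle. The result of this example is the Law of Conservation of Energy: work done equals change in kinetic energy.

Integrals of Vector Functions

Next we define the integral of a vector function as a componentwise operation.

Suppose $\mathbf{u}(t) = (u_1(t), u_2(t), u_3(t))$ is defined for $a \le t \le b$. Then
$$\int_a^b\mathbf{u}(t)\,dt = \left(\int_a^bu_1(t)\,dt,\ \int_a^bu_2(t)\,dt,\ \int_a^bu_3(t)\,dt\right).$$

Notice that the integral of a vector function is a vector, whereas a line integral is a scalar.

EXAMPLE 5.3 Let u (t) = (1, t 1, 2). Find J u( t ) dt. Solution: t2) d t J u (t) dt = J (1, t 1, i n r ,

The integral of a vector function $\mathbf{u}(t)$ is particularly easy to evaluate if an antiderivative of $\mathbf{u}(t)$ is known.

If $\mathbf{u}(t) = \dot{\mathbf{w}}(t)$, then
$$\int_a^b\mathbf{u}(t)\,dt = \mathbf{w}(b) - \mathbf{w}(a).$$

To prove this, simply check the three components; each is the integral of a derivative. Here is an example:
$$\int_0^2(2t, 3t^2, 4t^3)\,dt = \int_0^2\frac{d}{dt}(t^2, t^3, t^4)\,dt = (t^2, t^3, t^4)\Big|_0^2 = (4, 8, 16).$$

Momentum

The following applications show the importance in physics of vector-valued integrals.

A particle of mass $m$ has position $\mathbf{x} = \mathbf{x}(t)$ at time $t$ and moves under the action of a (variable) force $\mathbf{F}$, so that
$$\mathbf{F} = m\ddot{\mathbf{x}} = m\frac{d\dot{\mathbf{x}}}{dt}.$$
Integrate with respect to $t$ on the interval $t_0 \le t \le t_1$:
$$\int_{t_0}^{t_1}\mathbf{F}\,dt = \int_{t_0}^{t_1}m\frac{d\dot{\mathbf{x}}}{dt}\,dt,$$
hence
$$\int_{t_0}^{t_1}\mathbf{F}\,dt = m\dot{\mathbf{x}}(t_1) - m\dot{\mathbf{x}}(t_0).$$
The quantity $m\dot{\mathbf{x}}$ is the momentum of the particle, so the right-hand side is its change in momentum. The left-hand side is called the impulse of the force during the time from $t_0$ to $t_1$. This equation is a form of the Law of Conservation of Momentum: impulse equals change in momentum.

The angular momentum of the particle with respect to the origin $\mathbf{0}$ is defined as $m\mathbf{x}\times\dot{\mathbf{x}}$. Its time derivative is
$$\frac{d}{dt}[m\mathbf{x}\times\dot{\mathbf{x}}] = m\dot{\mathbf{x}}\times\dot{\mathbf{x}} + m\mathbf{x}\times\ddot{\mathbf{x}} = m\mathbf{x}\times\ddot{\mathbf{x}},$$
since $\dot{\mathbf{x}}\times\dot{\mathbf{x}} = \mathbf{0}$. But $m\ddot{\mathbf{x}} = \mathbf{F}$ by Newton's Law of Motion; hence
$$m\mathbf{x}\times\ddot{\mathbf{x}} = \mathbf{x}\times m\ddot{\mathbf{x}} = \mathbf{x}\times\mathbf{F},$$
which is the torque of $\mathbf{F}$ at $\mathbf{x}$. The result is that
$$\frac{d}{dt}[m\mathbf{x}\times\dot{\mathbf{x}}] = \mathbf{x}\times\mathbf{F}.$$
Integrating,
$$m\mathbf{x}\times\dot{\mathbf{x}}\,\Big|_{t_0}^{t_1} = \int_{t_0}^{t_1}\mathbf{x}\times\mathbf{F}\,dt.$$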
This result, called the Law of Conservation of Angular Momentum, asserts that the change in angular momentum during a motion is the time integral of the torque.

EXERCISES

Evaluate:
1. $\displaystyle\int_{\mathbf{x}(0)}^{\mathbf{x}(1)}(t, 2t, 3t)\cdot d\mathbf{x}$; $\mathbf{x}(t) = (1, t, t^2)$
2. $\displaystyle\int_{\mathbf{x}(0)}^{\mathbf{x}(\pi)}(\cos t, \sin t)\cdot d\mathbf{x}$; $\mathbf{x}(t) = (\sin t, \cos t)$
3. $\displaystyle\int_{\mathbf{x}(0)}^{\mathbf{x}(2\pi)}(0, 0, 3t)\cdot d\mathbf{x}$; $\mathbf{x}(t) = (\cos t, \sin t, t + 1)$
4. $\displaystyle\int_{\mathbf{x}(-1)}^{\mathbf{x}(1)}(e^t, e^t, e^t)\cdot d\mathbf{x}$; x(t) = ( t I , t I, t)
5. $\displaystyle\int_{(0,0,0)}^{(1,1,1)}(t, t^2, t)\cdot d\mathbf{x}$; straight path
6. I*X(27r) / -- y \ 6' i , V + 7 i ? + ? l ' hi
7. $\displaystyle\int_{\mathbf{x}(0)}^{\mathbf{x}(1)}x\,dx + y\,dy$; $\mathbf{x}(t) = (t^2, t^3)$
8. $\displaystyle\int_{(1,1,2)}^{(-1,0,4)}z\,dy + y\,dz + 2\,dx$; straight path.

9. Find the work done by the uniform gravitational field $\mathbf{F} = (0, 0, -g)$ in moving a particle from $(0, 0, 1)$ to $(1, 1, 0)$ along a straight path.
10. Find the work done by the central force field F = x in moving a particle from $(2, 2, 2)$ to $(1, 1, 1)$ along a straight path.

Evaluate:
11. $\displaystyle\int(1 + t, 1 + 2t, 1 + 3t)\,dt$
12. $\displaystyle\int(\cos t, \sin t, 1)\,dt$
u i ' 0 - ? . ? ) *

15*. Let $\mathbf{x} = \mathbf{x}(t)$, where $a \le t \le b$, be a plane curve which does not pass through $(0, 0)$. Show that $\dfrac{1}{2}\displaystyle\int_{\mathbf{x}(a)}^{\mathbf{x}(b)}(x\,dy - y\,dx)$ is the area in Fig. 5.2.
Show that $t$ is related to the area $A$ by $t = 2A/ab$. See Fig. 5.3. [Hint: Use Ex. 15.]

[Fig. 5.3: the point $(a\cosh t, b\sinh t)$]

17. A circle of radius $a$ rolls along the $x$-axis. Initially its center is $(0, a)$. The point on the circle initially at $(0, 0)$ traces a curve called the cycloid. Show that $\mathbf{x}(\theta) = a(\theta - \sin\theta, 1 - \cos\theta)$ is a parametrization of the cycloid, where $\theta$ is the angle at the center of the rolling circle, measured clockwise from the downward vertical to the moving point. Graph the curve.
18. (cont.) Show that the area under one arch of the cycloid is 3 times the area of the circle. [Hint: Express $y\,dx$ in terms of $\theta$ and integrate.]
19. (cont.) Find the arc length of one arch of the cycloid.
20. Show that the witch of Agnesi $y = 1/(1 + x^2)$ is parameterized by $\mathbf{x} = \mathbf{x}(\theta) = (\cot\theta, \sin^2\theta)$. Sketch the curve.
21. Show that the folium of Descartes $x^3 + y^3 = 3axy$ is parameterized by $\mathbf{x} = \mathbf{x}(t) = (1 + t^3)^{-1}(3at, 3at^2)$. Sketch the curve.
22. The force F(t) = (1 t, 1 t2, 1 t3) acts from $t = 0$ to $t = 2$. Find its impulse.
23. The force $\mathbf{F}(t) = (e^t, e^{2t}, e^{3t})$ acts from t = 1 to t = 0. Find its impulse.
24. An electron in a uniform magnetic field follows the spiral path $\mathbf{x}(t) = (a\cos t, a\sin t, bt)$. Find its angular momentum with respect to $\mathbf{0}$.
25. A particle of unit mass moves on the unit sphere $|\mathbf{x}| = 1$ with unit speed. Show that its angular momentum with respect to $\mathbf{0}$ is a unit vector.

6. POLAR COORDINATES

Review

The polar coordinates of a point $\mathbf{x} \neq \mathbf{0}$ in the plane are the distance $r = |\mathbf{x}|$ of $\mathbf{x}$ from $\mathbf{0}$, and the angle $\theta$ from the positive $x$-axis to the vector $\mathbf{x}$, measured counterclockwise. The angle $\theta$ is determined up to a multiple of $2\pi$. We shall write polar coordinates $\{r, \theta\}$ with curly braces to distinguish them from rectangular coordinates $(x, y)$.

Any value of $r$ is allowed, even $r = 0$ (the origin) and negative $r$; the point $\{-r, \theta\}$ is the reflection of $\{r, \theta\}$ through the origin and is identical with $\{r, \theta + \pi\}$. Note that $\theta$ is undefined at the origin.

The rectangular coordinates $(x, y)$ of the point with polar coordinates $\{r, \theta\}$ are given by
$$x = r\cos\theta, \qquad y = r\sin\theta.$$
Conversely, given $(x, y)$ we find $\{r, \theta\}$ from
$$r^2 = x^2 + y^2, \qquad \cos\theta = \frac{x}{r}, \qquad \sin\theta = \frac{y}{r}.$$
(The single formula for the angle, $\tan\theta = y/x$, is not adequate to distinguish quadrants. For example, $\tan 0 = \tan\pi = 0$.)

The number $r$ is called the radius of the point $\mathbf{x} = \{r, \theta\}$ and $\theta$ is called the polar angle.

A curve may be presented by a relation between $r$ and $\theta$, often in the form $r = f(\theta)$. For example, $r = a$ is the equation of the circle of radius $a$ with center $\mathbf{0}$. The graph of $r = \cos\theta$ is also a circle, but with center $(\frac{1}{2}, 0)$ and radius $\frac{1}{2}$, because of the computation
$$r = \cos\theta, \qquad r^2 = r\cos\theta, \qquad x^2 + y^2 = x,$$
$$(x^2 - x) + y^2 = 0, \qquad \left(x - \tfrac{1}{2}\right)^2 + y^2 = \tfrac{1}{4}.$$

Length

Suppose a curve is given in the form
$$r = r(t), \qquad \theta = \theta(t), \qquad t_0 \le t \le t_1.$$
What is the length?
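Write
$$\mathbf{x} = (r\cos\theta, r\sin\theta) = r(\cos\theta, \sin\theta).$$
Differentiate:
$$\dot{\mathbf{x}} = \dot{r}(\cos\theta, \sin\theta) + r\dot\theta(-\sin\theta, \cos\theta) = \dot{r}\mathbf{u} + r\dot\theta\mathbf{w},$$
where $\mathbf{u} = (\cos\theta, \sin\theta)$ and $\mathbf{w} = (-\sin\theta, \cos\theta)$. Now
$$\left(\frac{ds}{dt}\right)^2 = \dot{\mathbf{x}}\cdot\dot{\mathbf{x}} = (\dot{r}\mathbf{u} + r\dot\theta\mathbf{w})\cdot(\dot{r}\mathbf{u} + r\dot\theta\mathbf{w}) = \dot{r}^2\,\mathbf{u}\cdot\mathbf{u} + 2r\dot{r}\dot\theta\,\mathbf{u}\cdot\mathbf{w} + r^2\dot\theta^2\,\mathbf{w}\cdot\mathbf{w}.$$
But $\mathbf{u}\cdot\mathbf{u} = 1$, $\mathbf{w}\cdot\mathbf{w} = 1$, and $\mathbf{u}\cdot\mathbf{w} = 0$. Hence,
$$\left(\frac{ds}{dt}\right)^2 = \dot{r}^2 + r^2\dot\theta^2.$$
It follows that
$$\text{Length} = \int_{t_0}^{t_1}\sqrt{\dot{r}^2 + r^2\dot\theta^2}\ dt.$$

Figure 6.1 provides an aid to memory. The right triangle has sides $dr$, $r\,d\theta$, and $ds$, so the Pythagorean Theorem suggests
$$(ds)^2 = (dr)^2 + r^2(d\theta)^2.$$

[Fig. 6.1]

Suppose a curve is given by $r = r(\theta)$. This is a special case of the previous situation with $\theta = t$ and $r = r(t)$. The length formula specializes to
$$L = \int_{\theta_0}^{\theta_1}\sqrt{r^2 + \left(\frac{dr}{d\theta}\right)^2}\ d\theta.$$

EXAMPLE 6.1 Find the length of the spiral $r = \theta^2$, $0 \le \theta \le 2\pi$.

Solution:
$$L = \int_0^{2\pi}\sqrt{\theta^4 + 4\theta^2}\ d\theta = \int_0^{2\pi}\theta\sqrt{\theta^2 + 4}\ d\theta = \frac{1}{3}(\theta^2 + 4)^{3/2}\Big|_0^{2\pi}.$$

Answer: $L = \dfrac{8}{3}(\pi^2 + 1)^{3/2} - \dfrac{8}{3}$.

Area

There are problems that require the area swept out by the segment joining $\mathbf{0}$ to a moving point on a curve (Fig. 6.2). Suppose the curve is given by
$$r = r(t), \qquad \theta = \theta(t), \qquad t_0 \le t \le t_1.$$

[Fig. 6.2]

In a small time interval $dt$, a thin triangle of base $r\,d\theta$ and height $r$ (ignoring negligible errors) is swept out (Fig. 6.3). Hence
$$dA = \frac{1}{2}r^2\,d\theta = \frac{1}{2}r^2\frac{d\theta}{dt}\,dt, \qquad A = \frac{1}{2}\int_{t_0}^{t_1}r^2\frac{d\theta}{dt}\,dt.$$

[Fig. 6.3]

In the special case that the curve is given by $r = r(\theta)$ for $\theta_0 \le \theta \le \theta_1$, choose $t = \theta$. The formula specializes to
$$A = \frac{1}{2}\int_{\theta_0}^{\theta_1}r^2\,d\theta.$$

EXAMPLE 6.2 Find the area of the four-petal rose $r = a\cos 2\theta$.

Solution: Graph the curve carefully (Fig. 6.4). The portion on which $0 \le \theta \le \pi/2$ is emphasized. Note that $r < 0$ for $\pi/4 < \theta \le \pi/2$. Because of symmetry it suffices to find the area of half of one petal. Thus
$$A = 8\int_0^{\pi/4}\frac{1}{2}(a\cos 2\theta)^2\,d\theta = 4a^2\int_0^{\pi/4}\cos^2 2\theta\,d\theta = 2a^2\int_0^{\pi/2}\cos^2t\,dt.$$

Answer: $A = \frac{1}{2}\pi a^2$.

[Fig. 6.4]

Summary

Suppose a curve is given in polar coordinates by
$$r = r(t), \quad \theta = \theta(t), \quad t_0 \le t \le t_1 \qquad [\,r = r(\theta),\ \theta_0 \le \theta \le \theta_1\,].$$
ARC LENGTH:
$$L = \int_{t_0}^{t_1}\sqrt{\dot{r}^2 + r^2\dot\theta^2}\ dt \qquad \left[\,L = \int_{\theta_0}^{\theta_1}\sqrt{r^2 + \left(\frac{dr}{d\theta}\right)^2}\ d\theta\,\right].$$
AREA:
$$A = \frac{1}{2}\int_{t_0}^{t_1}r^2\frac{d\theta}{dt}\,dt \qquad \left[\,A = \frac{1}{2}\int_{\theta_0}^{\theta_1}r^2\,d\theta\,\right].$$

EXERCISES

Express in rectangular coordinates:
1. $\{1, \pi/2\}$
2. $\{1, 3\pi/2\}$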
3. $\{2, \pi/4\}$
4. $\{-4, 3\pi/4\}$
5. $\{1, \pi/6\}$
6. $\{2, 5\pi/6\}$.

Express in polar coordinates:
7. $(1, -1)$
8. $(-1, -1)$
9. $(-\sqrt{3}, 1)$
10. ( y/ 3/ 2).

11. Find the length of the spiral of Archimedes $r = a\theta$ from $\theta = 0$ to $\theta = 1$.
12. Set up an integral for the length of the four-petal rose $r = a\cos 2\theta$.
13. Set up an integral for the length of the three-petal rose $r = a\cos 3\theta$.
14. Find the length of the one-petal rose $r = a\sin\theta$. Precisely what is this curve?
15. Find the area enclosed by the three-petal rose $r = a\cos 3\theta$.
16. Find the area enclosed by the $(2n + 1)$-petal rose $r = a\cos(2n + 1)\theta$.
17. Find the area enclosed by the $4n$-petal rose $r = a\cos 2n\theta$.
18. Find the area enclosed by the curve $r = a\cos^2 2n\theta$.
19. Find the area enclosed by the cardioid $r = a(1 - \cos\theta)$.
20. Show that the cissoid of Diocles $y^2 = x^3/(a - x)$ can be expressed in polar coordinates by $r = a(\sec\theta - \cos\theta)$; sketch the curve.
21. Find the area enclosed by the figure eight, the lemniscate of Bernoulli $r^2 = a^2\cos 2\theta$. Sketch the curve.
22. Sketch the strophoid $r = a\cos 2\theta\sec\theta$. Find the area of the closed loop.
23. Sketch the limaçon of Pascal $r = b + a\cos\theta$ in the three cases $0 < a < b$, $0 < a = b$, and $0 < b < a$. In the third case compute the area between the two loops.

7. POLAR VELOCITY AND ACCELERATION [optional]

Let us think of a curve given by
$$r = r(t), \qquad \theta = \theta(t)$$
as the path of a particle. What are its velocity and acceleration vectors?

In the above discussion of length, the perpendicular unit vectors
$$\mathbf{u} = (\cos\theta, \sin\theta) \qquad\text{and}\qquad \mathbf{w} = (-\sin\theta, \cos\theta)$$
were introduced. In using polar coordinates, it is natural to express the velocity and acceleration in terms of these vectors (Fig. 7.1). Note that $\mathbf{x} = r\mathbf{u}$ and that
$$\dot{\mathbf{u}} = \dot\theta\mathbf{w}, \qquad \dot{\mathbf{w}} = -\dot\theta\mathbf{u}.$$
Differentiate $\mathbf{x} = r\mathbf{u}$:
$$\mathbf{v} = \dot{\mathbf{x}} = \dot{r}\mathbf{u} + r\dot\theta\mathbf{w}.$$
Differentiate again:
$$\mathbf{a} = \dot{\mathbf{v}} = (\ddot{r}\mathbf{u} + \dot{r}\dot\theta\mathbf{w}) + (\dot{r}\dot\theta\mathbf{w} + r\ddot\theta\mathbf{w} - r\dot\theta^2\mathbf{u}).$$
Hence
$$\mathbf{a} = (\ddot{r} - r\dot\theta^2)\mathbf{u} + (r\ddot\theta + 2\dot{r}\dot\theta)\mathbf{w}.$$

Let us apply this formula to motion involving a central force $\mathbf{F} = f(t)\mathbf{u}$.
At each instant, the force is directed toward or away from the origin. Since $m\mathbf{a} = \mathbf{F}$, the component of $\mathbf{a}$ in the direction of $\mathbf{w}$ is zero:
$$r\ddot\theta + 2\dot r\dot\theta = 0, \quad\text{that is,}\quad \frac{1}{2}r^2\ddot\theta + r\dot r\dot\theta = 0.$$
This is the same as
$$\frac{d}{dt}\left(\frac{1}{2}r^2\dot\theta\right) = 0, \quad\text{that is,}\quad \frac{1}{2}r^2\dot\theta = \text{constant}.$$
But
$$\frac{1}{2}r^2\dot\theta = \frac{dA}{dt},$$
the rate at which central area is swept out by the curve. It follows that the same area is swept out in equal time anywhere along the path. This is Kepler's Second Planetary Law; it is a case of the Law of Conservation of Angular Momentum.

Summary
Suppose a curve is given in polar coordinates by $r = r(t)$, $\theta = \theta(t)$.
VELOCITY: $\displaystyle \mathbf{v} = \frac{dr}{dt}\,\mathbf{u} + r\frac{d\theta}{dt}\,\mathbf{w}$
ACCELERATION: $\displaystyle \mathbf{a} = \left[\frac{d^2r}{dt^2} - r\left(\frac{d\theta}{dt}\right)^2\right]\mathbf{u} + \left[r\frac{d^2\theta}{dt^2} + 2\frac{dr}{dt}\frac{d\theta}{dt}\right]\mathbf{w}$
where $\mathbf{u} = (\cos\theta, \sin\theta)$ and $\mathbf{w} = (-\sin\theta, \cos\theta)$.

EXERCISES
This set of exercises develops Kepler's First and Third Laws of Planetary Motion. Assume a particle of unit mass is moving in a central force field given by the inverse square law:
$$\mathbf{F} = -\frac{1}{r^2}\,\mathbf{u}.$$
1. Show that the equations of motion are $r^2\dot\theta = J$, $\ddot r - \dfrac{J^2}{r^3} = -\dfrac{1}{r^2}$, where $J$ is a constant.
2. Show that $\dot r^2 + \dfrac{J^2}{r^2} = \dfrac{2}{r} + C$, where $C$ is a constant. This equation is essentially the Law of Conservation of Energy. [Hint: Multiply the second equation in Ex. 1 by $\dot r$ and integrate.]
3. Set $p = \dfrac{1}{r}$. Show that $\dfrac{\dot p^2}{p^4} + J^2p^2 = 2p + C$.
4. Imagine $\theta = \theta(t)$ solved for $t$ as a function of $\theta$ and this substituted into $p = p(t)$. Thus $p$ may be considered as a function of $\theta$. Show that $\dot p = Jp^2\,\dfrac{dp}{d\theta}$ and conclude that $J^2\left[\left(\dfrac{dp}{d\theta}\right)^2 + p^2\right] = 2p + C$.
5. Show that $\dfrac{d^2p}{d\theta^2} + p = \dfrac{1}{J^2}$. [Hint: Differentiate the previous relation.]
6. Show that $p = A\cos\theta + B\sin\theta + 1/J^2$, where $A$ and $B$ are constants, is a solution of the preceding differential equation. (In Chapter 14 it is shown that every solution is of this type.)
7. Show that by a suitable choice of the $x$-axis, the solution may be written $\dfrac{1}{r} = \dfrac{1}{J^2}(1 - e\cos\theta)$, where $e$ is a constant, the eccentricity of the orbit, $e \ge 0$.
8.
By passing to rectangular coordinates, show that the orbit is a conic section.
9. Suppose $e = 0$. Show that the orbit is a circle with center at $\mathbf{0}$, and that the speed is constant.
10. Suppose $e = 1$. Show the orbit is a parabola with focus at the origin and opening in the positive $x$-direction.
11. Suppose $e > 1$. Show that the orbit is a branch of a hyperbola with one focus at the origin.
12. Suppose $0 < e < 1$. Show that the orbit is the ellipse $\dfrac{(x - c)^2}{a^2} + \dfrac{y^2}{b^2} = 1$, where $a = \dfrac{J^2}{1 - e^2}$, $b = \dfrac{J^2}{\sqrt{1 - e^2}}$, and $c = ae$. [By Ex. 8-12, each closed orbit is an ellipse (or circle), Kepler's First Law.]
13. (cont.) Show that $a^2 = b^2 + c^2$. Conclude that the foci of the ellipse are $(0, 0)$ and $(2c, 0)$.
14. (cont.) Let $T$ denote the period of the orbit, the time necessary for a complete revolution. Show that $\dfrac{J}{2}T = \pi ab$. [Hint: Use Kepler's Second Law.]
15. Conclude that $T^2 = 4\pi^2a^3$. This is Kepler's Third Law: The square of the period of a planetary orbit is proportional to the cube of its semimajor axis.

6. Functions of Several Variables

1. INTRODUCTION
Elementary calculus is concerned with functions such as $y = f(x)$, where one quantity $y$ depends on another quantity $x$. In all sorts of situations, however, a quantity may depend on several variables. Here are two examples:
(1) The speed $v$ of sound in an ideal gas is
$$v = \sqrt{\frac{\gamma p}{D}},$$
where $D$ is the density of the gas, $p$ is the pressure, and $\gamma$ is a constant characteristic of the gas. Then $v$ depends on (is a function of) the two variables $p$ and $D$. We may write $v = f(p, D)$, or $v = v(p, D)$.
(2) The area of a triangle with sides $x$, $y$, $z$ is
$$A = \sqrt{s(s - x)(s - y)(s - z)},$$
where $s$ is the semiperimeter $\frac{1}{2}(x + y + z)$. Then $A$ depends on the three variables $x$, $y$, and $z$. We may write $A = f(x, y, z)$, or $A = A(x, y, z)$.
Note that $x$, $y$, $z$ are not three arbitrary numbers but must satisfy the inequalities $x > 0$, $y > 0$, $z > 0$ and $z < x + y$, $x < y + z$, $y < z + x$.
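Example (2) can be sketched as a small Python function whose domain check mirrors the inequalities just listed. This is only an illustration under stated assumptions: the name `triangle_area` and the choice to raise `ValueError` outside the domain are not from the text.

```python
import math


def triangle_area(x: float, y: float, z: float) -> float:
    """Area of a triangle with sides x, y, z (the formula of Example (2)).

    The function is defined only on the domain described in the text:
    all sides positive and each side shorter than the sum of the other two.
    """
    if min(x, y, z) <= 0 or z >= x + y or x >= y + z or y >= z + x:
        # Hypothetical convention: points outside the domain are rejected.
        raise ValueError("(x, y, z) is not in the domain of the area function")
    s = (x + y + z) / 2  # semiperimeter
    return math.sqrt(s * (s - x) * (s - y) * (s - z))


# A point of the domain: the 3-4-5 right triangle.
print(triangle_area(3, 4, 5))
```

Calling `triangle_area(1, 1, 3)` raises an error, because the triple (1, 1, 3) violates z < x + y and so lies outside the domain.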
In Example (1), the quantity $v$ is a function of $p$ and $D$ defined for a certain set of pairs $(p, D)$, which we can think of as a subset of the $p, D$-plane. In Example (2), the area $A$ is a function of $x$, $y$, $z$ defined for a certain set of points $(x, y, z)$ in space.
In general, let $S$ be any subset of the plane $\mathbf{R}^2$ or space $\mathbf{R}^3$, and let $f$ be a real-valued function defined on $S$. Thus $f$ assigns a real number to each point in $S$. We write
$$f: S \to \mathbf{R},$$
and we say that $S$ is the domain of $f$. Alternatively, we say that $f$ is a function with domain $S$. Here $\mathbf{R}$ denotes as usual the set of all real numbers.
We want to extend the concepts of one-variable calculus to functions of several variables, concepts such as continuity, derivative, and integral. Now in the one-variable situation, in order for these concepts to be meaningful, the domain of a function has to be a reasonably "nice" set, generally an interval or the union of several intervals. For example, the integral is usually defined for a function on a closed interval. For a function of several variables, the nature of the domain is equally important. Therefore, we devote the next section to the study of properties of useful domains.

Set Notation
Let us review some standard notation that is useful in dealing with domains. The notation
$$S = \{x \mid P(x)\}$$
is read "$S$ is the set of all $x$ with property $P(x)$." For example, an open interval is defined by
$$(a, b) = \{x \mid a < x < b\}.$$
We even allow infinite open intervals such as $(a, \infty) = \{x \mid a < x\}$. A closed interval is defined by
$$[a, b] = \{x \mid a \le x \le b\}.$$
The square brackets indicate that end points are included, the round brackets that they are excluded. Sometimes we use mixed types such as
$$[a, b) = \{x \mid a \le x < b\}.$$
Each of these sets is a subset of the reals $\mathbf{R}$, and we write, for instance, $(a, b) \subset \mathbf{R}$. In general, $S \subset T$ is read "$S$ is a subset of $T$," and it means that each point of $S$ is a point of $T$.
If $x$ is a point of $S$, we write $x \in S$ and read "$x$ belongs to $S$."
Given two sets $S$ and $T$, we can form two other sets from them. First, their intersection is
$$S \cap T = \{x \mid x \in S \text{ and } x \in T\}.$$
It consists of all points both in $S$ and $T$. For example, let $a < b < c < d$. Then
$$(a, c) \cap (b, d) = (b, c).$$
This means that both $a < x < c$ and $b < x < d$ if and only if $b < x < c$.
The union (also called join) of $S$ and $T$ is
$$S \cup T = \{x \mid x \in S \text{ or } x \in T \text{ or both}\}.$$
It consists of all the points of $S$ and all the points of $T$ thrown together. For example, the domain of $F(x) = \sqrt{x^2 - 1}$ is
$$\{x \mid x \le -1 \text{ or } x \ge 1\} = (-\infty, -1] \cup [1, \infty).$$
Another example is
$$(-\infty, 1) \cup (-1, \infty) = (-\infty, \infty) = \mathbf{R}.$$
In this case the two sets overlap; indeed, $(-\infty, 1) \cap (-1, \infty) = (-1, 1)$. Still, each point of $\mathbf{R}$ is in either $(-\infty, 1)$ or $(-1, \infty)$; some are in both.
In set theory there is something called the empty set, the set with no points at all. We have no real use for this here, so it will be understood, without further notice, that whenever we say "set" we mean a non-empty set, a set with at least one point.

2. DOMAINS
In this section, we discuss closed sets and open sets in the plane and in space. These are sets that share certain basic properties with closed and open intervals on the line.
A closed interval $D = [a, b]$ has this property: if $x_n \in D$ and $x_n \to x$, then $x \in D$. In other words, if a sequence of points in $D$ converges, it converges to a point in $D$. Not all intervals have this property. For example, take $D = (0, 1)$ and $x_n = 1/(n + 1)$. Then $x_n \in D$, the sequence $\{x_n\}$ converges, but $\lim x_n = 0 \notin D$.

Convergence
We need the notion of convergence of a sequence of points in space. Henceforth, to avoid repetition, we shall use the word space to mean either two-space (plane) $\mathbf{R}^2$ or three-space $\mathbf{R}^3$.
The definition of convergence in space looks just like the definition on the line, with the nearness of points $\mathbf{x}$ and $\mathbf{y}$ measured by $|\mathbf{x} - \mathbf{y}|$, the distance between the points.
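The role of the distance $|\mathbf{x} - \mathbf{y}|$ as a measure of nearness can be sketched numerically. The snippet below is a minimal illustration, not part of the text: the helper `dist`, the sequence `x_n`, and the target point `a` are hypothetical choices made only to show distances shrinking toward zero.

```python
import math


def dist(p, q):
    """Euclidean distance |p - q| between two points of the plane or of space."""
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))


# Hypothetical sequence in three-space, chosen so that it approaches a = (1, 2, 3).
a = (1.0, 2.0, 3.0)


def x_n(n):
    return (1 + 1 / n, 2 - 1 / n**2, 3 + (-1) ** n / n)


# The real numbers |x_n - a| for a few values of n; they shrink toward 0.
distances = [dist(x_n(n), a) for n in (1, 10, 100, 1000)]
for n, d in zip((1, 10, 100, 1000), distances):
    print(n, d)
```

The printed distances form an ordinary sequence of real numbers, so questions about nearness in space reduce to questions about nearness on the line.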
Definition  Let $\{\mathbf{x}_n\}$ be a sequence of points in space and $\mathbf{a}$ another point. We say $\{\mathbf{x}_n\}$ converges to $\mathbf{a}$, and write
$$\mathbf{x}_n \to \mathbf{a} \quad\text{or}\quad \mathbf{a} = \lim_{n\to\infty}\mathbf{x}_n,$$
provided $|\mathbf{x}_n - \mathbf{a}| \to 0$ as $n \to \infty$.
From the given sequence $\{\mathbf{x}_n\}$ and point $\mathbf{a}$, we construct a new sequence $\{|\mathbf{x}_n - \mathbf{a}|\}$ of real numbers and we ask whether this sequence converges to 0. If yes, we say $\mathbf{x}_n \to \mathbf{a}$.
Just how close this definition is to the old definition of convergence for sequences of real numbers is seen in the following result, which interprets convergence in terms of coordinates.

Theorem  Let $\mathbf{x}_n = (x_n, y_n, z_n)$ and $\mathbf{a} = (a, b, c)$. Then $\mathbf{x}_n \to \mathbf{a}$ if and only if $x_n \to a$, $y_n \to b$, and $z_n \to c$.
Proof: Suppose $x_n \to a$, $y_n \to b$, and $z_n \to c$. Then $|x_n - a| \to 0$, so $(x_n - a)^2 \to 0$. Similarly, $(y_n - b)^2 \to 0$ and $(z_n - c)^2 \to 0$. Therefore
$$|\mathbf{x}_n - \mathbf{a}|^2 = (x_n - a)^2 + (y_n - b)^2 + (z_n - c)^2 \to 0,$$
and it follows that $|\mathbf{x}_n - \mathbf{a}| \to 0$, so $\mathbf{x}_n \to \mathbf{a}$.
Conversely, let $\mathbf{x}_n \to \mathbf{a}$. Then $|\mathbf{x}_n - \mathbf{a}| \to 0$. But
$$|\mathbf{x}_n - \mathbf{a}|^2 = (x_n - a)^2 + (y_n - b)^2 + (z_n - c)^2 \ge (x_n - a)^2,$$
so $|\mathbf{x}_n - \mathbf{a}| \ge |x_n - a| \ge 0$ and it follows that $|x_n - a| \to 0$, that is, $x_n \to a$. Similarly, $y_n \to b$ and $z_n \to c$.

Closed Sets
Now that we know what it means for a sequence of points to converge, we can define closed sets.
Definition  A set $S$ in space is closed provided each limit of a sequence of points taken from $S$ is itself in $S$. That is, if $\mathbf{x}_n \in S$ for $n = 1, 2, 3, \ldots$ and if $\mathbf{x}_n \to \mathbf{x}$, then $\mathbf{x} \in S$.
The intuitive idea of a closed set is a clump $S$ of points in space which includes all of its boundary points. Any point of space that you can "sneak up on" by points of $S$ must be a point of $S$. The following examples should help.

EXAMPLE 2.1 (Closed Half-Plane)
Let $a$, $b$, $c$ be given with $a^2 + b^2 \ne 0$, and let $H = \{(x, y) \mid ax + by \ge c\}$. Prove that $H$ is a closed set.
[Fig. 2.1: (a) closed half-plane, see Example 2.1; (b) elliptical region, see Example 2.2]

Solution: Suppose $\mathbf{x}_n \in H$ and $\mathbf{x}_n \to \mathbf{x}$. We must prove that $\mathbf{x} \in H$. Now $\mathbf{x}_n = (x_n, y_n) \to \mathbf{x} = (x, y)$ and $ax_n + by_n \ge c$. Since $\mathbf{x}_n \to \mathbf{x}$, we have $x_n \to x$ and $y_n \to y$. Therefore $ax_n + by_n \to ax + by$. But $ax_n + by_n \ge c$, hence $ax + by \ge c$, that is, $(x, y) \in H$. See Fig. 2.1a.

EXAMPLE 2.2
Let $a > 0$ and $b > 0$ and set
$$E = \left\{(x, y) \;\Big|\; \frac{x^2}{a^2} + \frac{y^2}{b^2} \le 1\right\}.$$
Prove that $E$ is a closed set.
Solution: Let $(x_n, y_n) \in E$ and $(x_n, y_n) \to (x, y)$. To prove: $(x, y) \in E$. Now $x_n \to x$ and $y_n \to y$, hence
$$\frac{x_n^2}{a^2} + \frac{y_n^2}{b^2} \to \frac{x^2}{a^2} + \frac{y^2}{b^2}.$$
But $x_n^2/a^2 + y_n^2/b^2 \le 1$. Therefore $x^2/a^2 + y^2/b^2 \le 1$, so $(x, y) \in E$. See Fig. 2.1b.

EXAMPLE 2.3
Let $S$ and $T$ be closed sets that have common points. Prove that $S \cap T$ is closed.
Solution: Let $\mathbf{x}_n \in S \cap T$ and $\mathbf{x}_n \to \mathbf{x}$. To prove: $\mathbf{x} \in S \cap T$. Now $\mathbf{x}_n \in S$ and $S$ is closed, so $\mathbf{x} \in S$. Likewise $\mathbf{x} \in T$, so $\mathbf{x} \in S \cap T$.

[Fig. 2.2: (a) intersection of five closed half-planes; (b) intersection of two closed half-planes and an elliptical lamina; (c) intersection of three elliptical laminas. An intersection of closed sets is again a closed set.]

Remark: It follows easily that the intersection of any number of closed sets, if non-empty, is another closed set. We can use this result to construct many closed sets. See Fig. 2.2 for some examples.

EXAMPLE 2.4
Let $S$ and $T$ be closed sets. Prove that $S \cup T$ is a closed set.
Solution: Let $\mathbf{x}_n \in S \cup T$ and $\mathbf{x}_n \to \mathbf{x}$. To prove: $\mathbf{x} \in S \cup T$. Now either infinitely many of the points $\mathbf{x}_n$ belong to $S$ or infinitely many belong to $T$ (or both). For otherwise the sequence would have only a finite number of terms $\mathbf{x}_n$. Suppose infinitely many belong to $S$. That means there is a subsequence $\{\mathbf{x}_{n_j}\}$ of $\{\mathbf{x}_n\}$ with $\mathbf{x}_{n_j} \in S$. Since $\mathbf{x}_n \to \mathbf{x}$, then also $\mathbf{x}_{n_j} \to \mathbf{x}$. But $S$ is closed, so $\mathbf{x} \in S$. Hence $\mathbf{x} \in S \cup T$.

Open Sets
Definition  Let $S$ be a set in space and $\mathbf{x}_0$ a point of $S$.
Then $\mathbf{x}_0$ is called an interior point of $S$ provided for some $\delta > 0$,
$$\{\mathbf{x} \mid |\mathbf{x} - \mathbf{x}_0| < \delta\} \subset S.$$
Interior points are important because we have a fighting chance to define derivatives at interior points of the domain of a function. See Fig. 2.3 for some plane examples that illustrate the concept. Note that if $\mathbf{x}_0$ is fixed, then $\{\mathbf{x} \mid |\mathbf{x} - \mathbf{x}_0| < \delta\}$ is the circular disk of radius $\delta$ and center $\mathbf{x}_0$, without the boundary circle.

[Fig. 2.3: $D = \{(x, y) \mid 0 \le x \le 1,\ 0 \le y \le 1\}$, the unit square; $\mathbf{u}$, $\mathbf{v}$, $\mathbf{w}$ are interior points; $\mathbf{x}$, $\mathbf{y}$ are not.]

In the figure, $\mathbf{u}$ is an interior point of $D$ because there is a circular disk centered at $\mathbf{u}$ and entirely inside $D$. Likewise $\mathbf{v}$ and $\mathbf{w}$ are interior points of $D$. But $\mathbf{x}$ is not an interior point of $D$ because any circular disk centered at $\mathbf{x}$ has points outside of $D$. Likewise $\mathbf{y}$ is not an interior point.
The interior points are precisely those points $(x, y)$ where $0 < x < 1$ and $0 < y < 1$. The remaining points of $D$ are not interior points. They are the points of $D$ where $x = 0$ or $x = 1$ or $y = 0$ or $y = 1$, that is, the points on the boundary of the square.
Now consider Fig. 2.4. The set $D$ is the unit disk without its boundary. Every point of $D$ is an interior point! For if $|\mathbf{x}_0| < 1$, then $\delta = 1 - |\mathbf{x}_0| > 0$, and if $|\mathbf{x} - \mathbf{x}_0| < \delta$, then
$$|\mathbf{x}| = |(\mathbf{x} - \mathbf{x}_0) + \mathbf{x}_0| \le |\mathbf{x} - \mathbf{x}_0| + |\mathbf{x}_0| < \delta + |\mathbf{x}_0| = 1,$$
so $\mathbf{x} \in D$.

[Fig. 2.4: $D = \{(x, y) \mid x^2 + y^2 < 1\}$, the open unit disk. Every point of $D$ is an interior point.]

In $\mathbf{R}^3$, the set $\{\mathbf{x} \mid |\mathbf{x} - \mathbf{x}_0| < \delta\}$ is a ball of radius $\delta$ and center $\mathbf{x}_0$, consisting of all points inside the spherical surface $|\mathbf{x} - \mathbf{x}_0| = \delta$.
Definition  A set $D$ is open if every point of $D$ is an interior point.
The set in Fig. 2.4 is open; that in Fig. 2.3 is not open because the points on the boundary of the square are not interior points, but they are points of the set.

EXAMPLE 2.5
Let $a > 0$ and $b > 0$, and set
$$D = \left\{(x, y) \;\Big|\; \frac{x^2}{a^2} + \frac{y^2}{b^2} < 1\right\}.$$
Prove that $D$ is an open set.
Solution: Suppose $(x_0, y_0) \in D$.
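To prove: $(x_0, y_0)$ is an interior point of $D$, that is, that there is a positive $\delta$ so small that $(x, y) \in D$ whenever $|(x, y) - (x_0, y_0)| < \delta$.
We have $x_0^2/a^2 + y_0^2/b^2 < 1$. The idea is to choose $\delta$ so small that $x^2/a^2 + y^2/b^2$ is very close to $x_0^2/a^2 + y_0^2/b^2$ whenever $|(x, y) - (x_0, y_0)| < \delta$. How close? Closer than $\varepsilon = 1 - (x_0^2/a^2 + y_0^2/b^2)$. Then $x^2/a^2 + y^2/b^2$ must be less than 1.
Now $f(x) = x^2/a^2$ and $g(y) = y^2/b^2$ are continuous, so we can make $|f(x) - f(x_0)| < \frac{1}{2}\varepsilon$ and $|g(y) - g(y_0)| < \frac{1}{2}\varepsilon$ by demanding that $x$ be sufficiently near to $x_0$ and $y$ to $y_0$, say $|x - x_0| < \delta$ and $|y - y_0| < \delta$. Then
$$\left|\left(\frac{x^2}{a^2} + \frac{y^2}{b^2}\right) - \left(\frac{x_0^2}{a^2} + \frac{y_0^2}{b^2}\right)\right| = |f(x) - f(x_0) + g(y) - g(y_0)| \le |f(x) - f(x_0)| + |g(y) - g(y_0)| < \tfrac{1}{2}\varepsilon + \tfrac{1}{2}\varepsilon = \varepsilon.$$
Since $|x - x_0|$ and $|y - y_0|$ are both at most $|(x, y) - (x_0, y_0)|$, this estimate holds whenever $|(x, y) - (x_0, y_0)| < \delta$. Hence $x^2/a^2 + y^2/b^2 < x_0^2/a^2 + y_0^2/b^2 + \varepsilon = 1$, so $(x, y) \in D$.

Warning: Despite what the words seem to suggest, not every set is either open or closed. For example, take a disk in the plane together with some, but not all, of its boundary points. For a more unusual example let $S = \{(x, y) \mid x \text{ and } y \text{ are rational numbers}\}$. Then $S$ is very far from being either open or closed. Can you see why?

EXERCISES
1. Let $\mathbf{x}_n \to \mathbf{a}$ and $\mathbf{y}_n \to \mathbf{b}$. Prove that $\mathbf{x}_n + \mathbf{y}_n \to \mathbf{a} + \mathbf{b}$.
2. Let $c_n \to c$ and $\mathbf{x}_n \to \mathbf{x}$. Prove that $c_n\mathbf{x}_n \to c\mathbf{x}$.
3. Let $\mathbf{x}_n \to \mathbf{x}$. Prove $|\mathbf{x}_n| \to |\mathbf{x}|$.
4. Let $\mathbf{x}_n \to \mathbf{x}$ and $\mathbf{y}_n \to \mathbf{y}$. Prove $\mathbf{x}_n \cdot \mathbf{y}_n \to \mathbf{x} \cdot \mathbf{y}$.
5. Let $\mathbf{x}_n \to \mathbf{x}$ and $\mathbf{y}_n \to \mathbf{y}$. Prove $\mathbf{x}_n \times \mathbf{y}_n \to \mathbf{x} \times \mathbf{y}$.
Prove each of the following sets is closed and sketch:
6. $\{(x, y) \mid y \le x^2\}$
7. $\{(x, y) \mid y \ge x^2\}$
8. $\{(x, y) \mid x^2/a^2 + y^2/b^2 \ge 1\}$
9. $\{(x, y) \mid x^2/a^2 - y^2/b^2 \le 1\}$
10. $\{(x, y, z) \mid x^2 + y^2 + z^2 \le 1\}$.
11. Prove that $\mathbf{R}^3$ is a closed subset of $\mathbf{R}^3$.
12. Show that $\{\mathbf{x} \mid 0 < |\mathbf{x}| < 1\}$ is not a closed set.
13. Prove that $\{(x, y) \mid 0 < x < 1\}$ is an open set. Sketch it.
14. Prove that $S \cup T$ is open if $S$ and $T$ are.
15. Suppose $S$ and $T$ are open and have common points. Prove that $S \cap T$ is open.
16*. Prove that $S$ is closed if and only if its complement (the rest of space) is open or empty.
Is the set open or closed or neither?
17. $\{(x, y) \mid 2x + 3y > 1\}$
18.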
the domain of the function $f(x, y) = \dfrac{1}{x - y}$
19. $\{(x, y) \mid 0 < x < 4,\ 2 < y < 3\}$
20. $\{(x, y) \mid 0 < y < e^x\}$
21. $\{(x, y) \mid x \text{ is an integer}\}$
22. $\{(x, y) \mid \text{neither } x \text{ nor } y \text{ is an integer}\}$.
23*. Prove the Cauchy criterion: A sequence $\{\mathbf{x}_n\}$ of points in $\mathbf{R}^3$ converges if and only if for each $\varepsilon > 0$, there exists a positive integer $N$ such that $|\mathbf{x}_m - \mathbf{x}_n| < \varepsilon$ whenever $m, n \ge N$. [Hint: Use the one variable case, p. 5.]
24*. Prove the Bolzano-Weierstrass Theorem: If $\{\mathbf{x}_n\}$ is a bounded sequence of points in $\mathbf{R}^3$, then there is a convergent subsequence. You may presuppose the result for $\mathbf{R}$.

3. CONTINUITY
For a function of several variables to be useful, it must have some reasonable properties. The most basic of such properties is continuity. Here is the formal definition, a direct generalization of the definition of continuity for a function of one variable.
Definition  Let $f: D \to \mathbf{R}$, where $D$, the domain of $f$, is a subset of $\mathbf{R}^2$ or $\mathbf{R}^3$. Let $\mathbf{a}$ be a point of $D$. We say $f$ is continuous at $\mathbf{a}$ if $f(\mathbf{x}) \to f(\mathbf{a})$ as $\mathbf{x} \to \mathbf{a}$. Precisely, for each $\varepsilon > 0$ there exists $\delta > 0$ such that $|f(\mathbf{x}) - f(\mathbf{a})| < \varepsilon$ whenever $\mathbf{x} \in D$ and $|\mathbf{x} - \mathbf{a}| < \delta$.
We say $f$ is continuous on $D$ if $f$ is continuous at each point of $D$.
As for functions of one variable, this definition requires that a continuous function be "predictable"; you should be able to predict the value of the function at $\mathbf{a}$ from its values near $\mathbf{a}$.
The elementary properties of continuous functions of one variable carry over to this case easily. In particular, sums, products, and quotients (with nonzero denominator) of continuous functions are continuous.
Obviously the functions defined by $f(x, y) = x$ and $g(x, y) = y$ are continuous. By forming products and sums we conclude that each polynomial is a continuous function on $\mathbf{R}^2$. (Of course the same holds on $\mathbf{R}^3$.) From this we deduce that each rational function is continuous wherever the denominator is not zero.
(Recall that a rational function is a quotient of polynomials.)
It is also not hard to see that if $f(x)$ and $g(y)$ are continuous functions of one variable, then $h(x, y) = f(x)g(y)$ is a continuous function of two variables. (See Ex. 2.) For example, the continuity of $h(x, y) = x\ln y$ follows from that of $f(x) = x$ and $g(y) = \ln y$.

Composite Functions
Suppose we wanted to prove that $f(x, y) = y^x$ is continuous on the domain $x > 0$, $y > 0$. We could write
$$f(x, y) = e^{x\ln y} = e^{g(x, y)}.$$
Thus $f(x, y)$ is the composite of the continuous functions $h(t) = e^t$ and $t = x\ln y$. It seems reasonable that $f(x, y)$ is continuous also.
Here is a more complicated type of example. Suppose we somehow manage to prove that
$$K(x, y, z) = \int_0^x (yz + t^4)\sin(zt^2)\,dt$$
is continuous in $x, y, z$-space. We want to conclude that $K(u^v, v^u, uv)$ is continuous on the domain $u > 0$, $v > 0$. What we need is the following theorem.

Theorem  Let $K(x, y, z)$ be continuous on a domain $D$ in $x, y, z$-space. Let $f$, $g$, $h$ be continuous on a domain $E$ of the $u, v$-plane, and suppose that
$$(f(u, v), g(u, v), h(u, v)) \in D$$
whenever $(u, v) \in E$. Then the composite function
$$k(u, v) = K[f(u, v), g(u, v), h(u, v)]$$
is continuous on $E$.
Proof: Let $(u, v) \to (u_0, v_0)$. Then $f(u, v) \to f(u_0, v_0)$, $g(u, v) \to g(u_0, v_0)$, and $h(u, v) \to h(u_0, v_0)$. Hence
$$(f(u, v), g(u, v), h(u, v)) \to (f(u_0, v_0), g(u_0, v_0), h(u_0, v_0)).$$
But $K$ is continuous, so
$$k(u, v) = K[f(u, v), g(u, v), h(u, v)] \to K[f(u_0, v_0), g(u_0, v_0), h(u_0, v_0)] = k(u_0, v_0);$$
therefore $k$ is continuous.
Note: The theorem above is stated for a function of three variables, where each variable is replaced by a function of two variables. Clearly, there is nothing special about "three" and "two," and the result may be modified as needed.

Maxima and Minima
One of the main concerns of calculus is maximum and minimum values of functions.
Recall one of the basic facts about continuous functions of one variable: If $f$ is continuous on a closed interval $[a, b]$, then there exist points $x_0$ and $x_1$ in the interval such that $f(x_0) \le f(x) \le f(x_1)$ for all $x \in [a, b]$. The result says that
$$f(x_0) = \min\{f(x) \mid a \le x \le b\}, \qquad f(x_1) = \max\{f(x) \mid a \le x \le b\}.$$
If the interval is not closed, then $f$ need not have a maximum or a minimum. For example, $f(x) = x$ has neither a maximum nor a minimum on the open interval $(a, b)$. The same holds for any continuous increasing or decreasing function. Furthermore, the result is not true on a domain which is unbounded, that is, contains points arbitrarily far from the origin. For example, on the domain $[0, \infty)$, the function $f(x) = e^{-x}$ has a maximum but no minimum; on $(0, \infty)$ it has neither.
The correct generalization of the preceding theorem requires a domain that is both closed and bounded.
Definition  A subset $S$ of space is bounded if there is a number $B$ such that $|\mathbf{x}| \le B$ for all $\mathbf{x} \in S$.
Thus a set is bounded provided it is contained in some sphere of finite radius centered at the origin.
Theorem  Let $S$ be a bounded and closed subset of space and let $f$ be continuous with domain $S$. Then there exist points $\mathbf{x}_0$ and $\mathbf{x}_1$ of $S$ such that
$$f(\mathbf{x}_0) \le f(\mathbf{x}) \le f(\mathbf{x}_1)$$
for all $\mathbf{x} \in S$.
A complete proof of this fundamental result is best postponed to a more advanced course; here is a brief sketch of a proof. Let $M = \sup\{f(\mathbf{x}) \mid \mathbf{x} \in S\}$. (Possibly $M = \infty$.) Choose $\mathbf{x}_n = (x_n, y_n, z_n) \in S$ so that $f(\mathbf{x}_n) \to M$.
The sequence $\{x_n\}$ is bounded (because each $\mathbf{x}_n$ is in $S$ and $S$ is bounded). Therefore $\{x_n\}$ has a convergent subsequence. We may restrict attention to this subsequence only. Hence we may assume $x_n \to a$. Next we look at $\{y_n\}$. Again, passing to a subsequence, we may assume $y_n \to b$. Similarly we may assume $z_n \to c$ and hence $\mathbf{x}_n \to \mathbf{a} = (a, b, c)$. Then $\mathbf{a} \in S$ because $S$ is closed, and $f(\mathbf{x}_n) \to f(\mathbf{a})$ because $f$ is continuous.
But $f(\mathbf{x}_n) \to M$, therefore $M = f(\mathbf{a})$. A similar argument applies to the minimum.

Uniform Continuity
The final property of continuous functions we shall consider is the property of uniform continuity. It is important in the theory of integration.
Definition  Let $f$ be defined on a set $S$. Then $f$ is called uniformly continuous on $S$ if for each $\varepsilon > 0$ there is a $\delta > 0$ such that
$$|f(\mathbf{x}) - f(\mathbf{z})| < \varepsilon$$
whenever $\mathbf{x}$ and $\mathbf{z}$ are in $S$ and $|\mathbf{x} - \mathbf{z}| < \delta$.
At first reading, this definition may appear to be the same as the definition of continuity. The point, however, is that given $\varepsilon$, the same $\delta$ works throughout the domain of $f$. In other words, the $\delta$ given by the definition of continuity is independent of the point $\mathbf{x}$.
It is rather obvious that each uniformly continuous function is continuous. The converse is not true in general (see Ex. 5). The converse is true, however, provided the domain is closed and bounded.
Theorem  Let $S$ be a bounded and closed subset of space and let $f$ be continuous on $S$. Then $f$ is uniformly continuous on $S$.
The proof of this theorem, like the last one, is best postponed to a later course. However, for the brave we offer the following:
Suppose for some $\varepsilon > 0$ there is no $\delta > 0$ that fills the bill. Then we can choose $\mathbf{x}_n \in S$ and $\mathbf{y}_n \in S$ such that $|\mathbf{x}_n - \mathbf{y}_n| \to 0$ and $|f(\mathbf{x}_n) - f(\mathbf{y}_n)| \ge \varepsilon$. As in the proof sketched for the previous theorem, by passing to a subsequence we may assume $\mathbf{x}_n \to \mathbf{a} \in S$. Then $\mathbf{y}_n \to \mathbf{a}$ also, since $\mathbf{y}_n = \mathbf{x}_n + (\mathbf{y}_n - \mathbf{x}_n)$ and $|\mathbf{y}_n - \mathbf{x}_n| \to 0$. Therefore $f(\mathbf{x}_n) \to f(\mathbf{a})$ and $f(\mathbf{y}_n) \to f(\mathbf{a})$ since $f$ is continuous at $\mathbf{a}$. This contradicts $|f(\mathbf{x}_n) - f(\mathbf{y}_n)| \ge \varepsilon$ for all $n$.

EXERCISES
1. Prove that the sum of two continuous functions is continuous.
2. If $f(x)$ and $g(y)$ are continuous, prove that $h(x, y) = f(x)g(y)$ is continuous.
3. Let $f(\mathbf{x}) = |\mathbf{x}|$ for $\mathbf{x} \in \mathbf{R}^3$. Prove that $f(\mathbf{x})$ is continuous using properties of length.
4. (cont.) Do Ex. 3 using composite functions.
5.
Prove that $f(x, y) = 1/(x + y)$ is continuous on the open first quadrant $x > 0$, $y > 0$, but is not uniformly continuous.
6*. (cont.) Prove that $f(x, y)$ is uniformly continuous on the domain $x \ge 1$, $y \ge 1$.
Suppose $f(x, y)$ is continuous on $\mathbf{R}^2$ and $c$ is constant. Prove:
7. The set $\{(x, y) \mid f(x, y) > c\}$ is open (if non-empty).
8. The set $\{(x, y) \mid f(x, y) = c\}$ is closed (if non-empty).
9. By what general principles do you know that $\sin(x + y)$ has a minimum value on the disk $x^2 + y^2 \le 1$?
10. Let $S$ be a closed bounded set in $\mathbf{R}^3$ and let $\mathbf{x}_0$ be a point not in $S$. Prove there is a point in $S$ closest to $\mathbf{x}_0$. [Hint: Ex. 3.]
11. (cont.) Show by example that there may be more than one point of $S$ closest to $\mathbf{x}_0$.
12*. Let $f(\mathbf{x})$ be continuous for all $\mathbf{x} \in \mathbf{R}^3$. Suppose that $f(\mathbf{x}) = 0$ for some $\mathbf{x}$ but $f(\mathbf{0}) \ne 0$. Prove there is a point nearest to $\mathbf{0}$ where $f(\mathbf{x}) = 0$.
13. Prove that $f(x, y) = x^2 - 6xy + 10y^2$ has a positive minimum value $p$ on the circle $x^2 + y^2 = 1$.
14. (cont.) If $f(x, y) = ax^2 + bxy + cy^2$ has a positive minimum $p$ on $x^2 + y^2 = 1$, prove that $f(x, y) > 0$ for all $(x, y)$ except $(0, 0)$. [Hint: Find the minimum of $f(x, y)$ on $x^2 + y^2 = r^2$.]
Let $f(x, y) = e^{-(x^2 + y^2)}$:
15. Find the maximum of $f(x, y)$ on $\mathbf{R}^2$.
16. Find the minimum of $f(x, y)$ on $\mathbf{R}^2$.
17. Is $f(x, y)$ continuous on $\mathbf{R}^2$?
18*. Is $f(x, y)$ uniformly continuous on $\mathbf{R}^2$?
19. Let $f(x, y)$ be continuous on $\mathbf{R}^2$ and let $b$ be a real number. Prove that $g(x) = f(x, b)$ is continuous on $\mathbf{R}$.
20. Let $f(x, y)$ be continuous on $\mathbf{R}^2$. Prove that $g(x) = f(x, x)$ is continuous on $\mathbf{R}$.
21*. Let $g(t)$ have a continuous derivative for all $t$. Define $f(x, y)$ by
$$f(x, y) = \frac{g(x) - g(y)}{x - y} \quad (x \ne y), \qquad f(x, x) = g'(x).$$
Prove that $f(x, y)$ is continuous on $\mathbf{R}^2$. [Hint: Use the Mean Value Theorem. It is stated on p. 285.]

4. GRAPHS
Suppose a function $z = f(x, y)$ is defined for all points $(x, y)$ in some domain in the plane. Its graph is the surface in space consisting of all points $(x, y, f(x, y))$, where $(x, y)$ is in the domain of $f$.
Figure 4.1 illustrates the graph of a function defined for all points $(x, y)$ in a circular disk.
To get an idea of the shape of the surface, draw several sections by planes perpendicular to the $z$-axis or the $y$-axis.

EXAMPLE 4.1
Graph the function $z = f(x, y) = 1 - x^2$.
Solution: The function $f(x, y)$ is independent of $y$. Its graph is a cylinder with generators parallel to the $y$-axis. To see this, first graph the parabola $z = 1 - x^2$ in the $x, z$-plane (Fig. 4.2a). If $(a, c)$ is any point on this parabola and $b$ is any value of $y$ whatsoever, then $(a, b, c)$ is on the graph of $z = f(x, y)$.
Answer: The graph is a parabolic cylinder with generators parallel to the $y$-axis. See Fig. 4.2b.

EXAMPLE 4.2
Graph the function $z = x^2 + y$.
Solution: Each cross-section by a plane $x = x_0$ is a straight line $z = y + x_0^2$ of slope 1. The surface meets the $x, z$-plane in the parabola $z = x^2$. See Fig. 4.3a. The figure does not yet convey the shape of the surface, so look at Fig. 4.3b. Now the picture is clearer.
Answer: The surface is a cylinder oblique to the $x, z$-plane; it intersects the $x, z$-plane in the parabola $z = x^2$.

[Fig. 4.3a]

EXAMPLE 4.3
Graph the function $z = x^2 + y^2$.
Solution: Each cross-section by a plane perpendicular to the $x$-axis is a parabola $z = x_0^2 + y^2$. This is shown in Fig. 4.4a. But from this figure, it is hard to visualize the surface. You learn more by studying the cross-sections of the graph by planes $z = z_0$ perpendicular to the $z$-axis. Each cross-section is a circle $x^2 + y^2 = z_0$.
Answer: The surface is a paraboloid of revolution (Fig. 4.4b).

[Fig. 4.4b]

EXERCISES
Sketch the graphs:
1. $z = f(x, y) = 1 - 2x$
2. $z = f(x, y) = 3 + y$
3. $z = x + y$
4. $z = 3x + 2y$
5. $z = (x - 2)^2$
6. $z = (y + 1)^3$
7. $z = x^3 + y$
8. $z = x^2 - y^2$
9. $z = x^2 + x - y$
10. $z = x^3 + y^2$
11. $z = xy$
12. $z = x^2 + xy + y^2$.
13.
Graph the function $z = \sqrt{9 - x^2 - y^2}$ for all points $(x, y)$ in the circle of radius 3 and center $(0, 0)$.
14. Graph the function $z = x^4 + y^4$ for all points $(x, y)$ in the domain $x^2 + y^2 \le 1$, $x \ge 0$, $y \ge 0$.
15. Find the curves formed when planes parallel to the $x, z$-plane intersect the graph of $z = 2x^2 + 3y^2$.
16. Find all points common to both the plane $x = 3$ and the graph of $z = x^2y$.
17. Let $f(x, y)$ be a continuous function defined on a closed domain $D$. Prove that the graph of $f$ is a closed subset of $\mathbf{R}^3$.
18. Prove that the graph of $f(x, y)$ is never an open subset of $\mathbf{R}^3$.

5. PARTIAL DERIVATIVES
Let $z = f(x, y)$ be a function of two variables and $\mathbf{a} = (a, b)$ an interior point of its domain. Suppose we set $y = b$ and allow only $x$ to vary. Then $f(x, b)$ is a function of the single variable $x$, defined at least in some open interval including $a$. We define
$$\frac{\partial z}{\partial x}(a, b) = \frac{\partial f}{\partial x}(a, b) = \frac{d}{dx}f(x, b)\Big|_{x=a}.$$
This is called the partial derivative (or simply partial) of $z$ with respect to $x$. (The $\partial$ is a "curly d.") It measures the rate of change of $z$ with respect to $x$ while $y$ is held constant.
Similarly, we define the partial derivative of $z$ with respect to $y$:
$$\frac{\partial z}{\partial y}(a, b) = \frac{\partial f}{\partial y}(a, b) = \frac{d}{dy}f(a, y)\Big|_{y=b}.$$
In like manner, given a function $w = f(x, y, z)$ of three variables, we may define the three partial derivatives $\partial w/\partial x$, $\partial w/\partial y$, and $\partial w/\partial z$. For instance,
$$\frac{\partial w}{\partial y}(a, b, c) = \frac{\partial f}{\partial y}(a, b, c) = \frac{d}{dy}f(a, y, c)\Big|_{y=b}.$$
Each of the partials is the derivative of $w$ with respect to the variable in question, taken while all other variables are held fixed.
Note: Our definition of partial derivatives applies only at interior points of the domain.

EXAMPLE 5.1
Let $z = f(x, y) = xy^2$. Find $\dfrac{\partial z}{\partial x}$ for $y = 3$, $\dfrac{\partial z}{\partial y}$ for $x = 4$, and $\dfrac{\partial z}{\partial x}$, $\dfrac{\partial z}{\partial y}$ in general.
Solution: If $y = 3$, then $z = 9x$, and
$$\frac{\partial z}{\partial x}\Big|_{y=3} = \frac{d}{dx}(9x) = 9.$$
Likewise, if $x = 4$, then $z = 4y^2$, and
$$\frac{\partial z}{\partial y}\Big|_{x=4} = \frac{d}{dy}(4y^2) = 8y.$$
In general, to compute $\partial z/\partial x$ just differentiate as usual, pretending $y$ is a constant:
$$\frac{\partial z}{\partial x} = \frac{\partial(xy^2)}{\partial x} = y^2\frac{\partial}{\partial x}(x) = y^2.$$
To compute $\partial z/\partial y$, differentiate, pretending $x$ is a constant:
$$\frac{\partial z}{\partial y} = \frac{\partial(xy^2)}{\partial y} = x\frac{\partial}{\partial y}(y^2) = 2xy.$$
We consider two further examples of partial derivatives.
(1) The gas law for a fixed mass of $n$ moles of an ideal gas is
$$P = nR\,\frac{T}{V},$$
where $R$ is the universal gas constant. Thus $P$ is a function of the two variables $T$ and $V$:
$$\frac{\partial P}{\partial T} = \frac{nR}{V}, \qquad \frac{\partial P}{\partial V} = -\frac{nRT}{V^2}.$$
(2) The area $A$ of a parallelogram of base $b$, slant height $s$, and angle $\alpha$ is $A = sb\sin\alpha$. The partial derivatives are
$$\frac{\partial A}{\partial s} = b\sin\alpha, \qquad \frac{\partial A}{\partial b} = s\sin\alpha, \qquad \frac{\partial A}{\partial\alpha} = sb\cos\alpha.$$

Geometric Interpretation
The graph of $z = f(x, y)$ is a surface in three dimensions. A plane $x = x_0$ cuts the graph in a plane curve $x = x_0$, $z = f(x_0, y)$. See Fig. 5.1a. If this curve is projected straight back onto the $y, z$-plane, the graph of the function $z = f(x_0, y)$ is obtained (Fig. 5.1b). The partial derivative
$$\frac{\partial f}{\partial y}(x_0, y)$$
is the slope of this graph.
For example, suppose the graph of the function $z = x + y^2$ is sliced by the plane $x = x_0$. See Fig. 5.2a. The resulting curve is the parabola $x = x_0$, $z = y^2 + x_0$. If this is projected onto the $y, z$-plane, a parabola is obtained (Fig. 5.2b). Its slope is
$$\frac{\partial z}{\partial y} = 2y.$$

[Fig. 5.2a] [Fig. 5.2b]

Notation
Unfortunately there are several different notations for partial derivatives in common use. Become familiar with them; they come up again and again in applications. Suppose $w = f(x, y, z)$. Common notations for $\partial w/\partial x$ are:
$$f_x, \quad f_x(x, y, z), \quad w_x, \quad w_x(x, y, z), \quad D_xf.$$
For example, if
$$w = f(x, y, z) = x^3y^2\sin z,$$
then
$$f_x = 3x^2y^2\sin z, \qquad w_y = 2x^3y\sin z, \qquad D_zf = x^3y^2\cos z.$$

EXERCISES
Find $\dfrac{\partial z}{\partial x}$ and $\dfrac{\partial z}{\partial y}$:
1. $z = f(x, y) = x + 2y$
2. $z = f(x, y) = 3x + 4y$
3. $z = 3xy - 2x^2$
4. $z = x^2y$
5. 2 = 2/+ 1
6. $z = x^2y + xy^2$
7. $z = x\sin y$
8. $z = y^2\cos x$
9. $z = \sin 2x + \cos 3y$
10. $z = \sin x\cos y$
11. $z = \sin 2xy$
12. $z = \cos(2x + y)$
15. $z = xe^y$
17. $z = e^{xy}$
18. $z = 3e^{x+y}$
19. $z = e^{2x}\sin y$
20.
$z = e^{-y}\cos x$.
21. Let $z = x^2y$. Find $\partial z/\partial x$ for $y = 2$, and $\partial z/\partial y$ for $x = 1$.
22. Let $z = y^2/x$. Find $z_x$ for $y = 3$.
23. Let w = xy2&. Find $w_x$ for $y = 2$ and $z = 2$, find $w_y$ for $x = 1$ and $z = 0$, and find $w_z$ for $x = y$.
24. Let w = xy xz yz. Find - - h . dx %
25. Show that $z = (3x - y)^2$ satisfies $\partial z/\partial x + 3\,\partial z/\partial y = 0$.
26. Show that $z = f(x) + y^2$ satisfies $\partial z/\partial y = 2y$.
27. Show that $z = x^2 - y^2$ satisfies $(\partial z/\partial x)^2 - (\partial z/\partial y)^2 = 4z$.
28. Show that z = x6xby + 7x V satisfies $x(\partial z/\partial x) + y(\partial z/\partial y) = 6z$.

6. MAXIMA AND MINIMA
Suppose $z = z(x, y)$ is a function of two variables defined on the domain $x_0 < x < x_1$, $y_0 < y < y_1$.
Suppose $z$ takes on its minimum value at $(x, y) = (a, b)$. By holding $y$ fixed at $y = b$, the function $z$ becomes a function of $x$ alone with minimum at $x = a$. Hence
$$\frac{\partial z}{\partial x}(a, b) = 0.$$
Similarly
$$\frac{\partial z}{\partial y}(a, b) = 0.$$
These two relations are often enough to locate the points where a function takes on its minimum (or maximum) value. See Fig. 6.1.
Geometrically the idea is simple: If the graph of $z(x, y)$ has a high (low) point at $(a, b, c)$, then so do its cross-sectional curves by planes through $(a, b, c)$ parallel to the $x, z$-plane and to the $y, z$-plane. The slopes of these curves are $\partial z/\partial x$ and $\partial z/\partial y$ respectively.

[Fig. 6.1: In each case $\partial f/\partial x = \partial f/\partial y = 0$ at $(a, b)$.]

EXAMPLE 6.1
Find the minimum of $z = x^2 - xy + y^2 + 3x$.
Discussion: Is the question reasonable? For a fixed value of $y$, say $y = b$, the resulting function of $x$,
$$z(x, b) = x^2 + (3 - b)x + b^2,$$
is a quadratic polynomial whose graph is a parabola turned upward; it has a minimum. Similarly for each fixed $x = a$, the function of $y$,
$$z(a, y) = y^2 - ay + (a^2 + 3a),$$
has a minimum. This is fairly good experimental evidence that the function $z(x, y)$ ought to have at least one minimum, probably no maximum.
Solution: The function is defined for all values of $x$ and $y$. Begin by finding all points $(x, y)$ at which both $\partial z/\partial x = 0$ and $\partial z/\partial y = 0$.
\[ \frac{\partial z}{\partial x} = 0 \quad\text{and}\quad \frac{\partial z}{\partial y} = 0. \]
Now
\[ \frac{\partial z}{\partial x} = \frac{\partial}{\partial x}(x^2 - xy + y^2 + 3x) = 2x - y + 3, \qquad \frac{\partial z}{\partial y} = \frac{\partial}{\partial y}(x^2 - xy + y^2 + 3x) = -x + 2y, \]
so the conditions are
\[ 2x - y + 3 = 0, \qquad -x + 2y = 0. \]
Solve: $x = -2$, $y = -1$. The corresponding value of $z$ is
\[ z(-2, -1) = (-2)^2 - (-2)(-1) + (-1)^2 + 3(-2) = 4 - 2 + 1 - 6 = -3. \]
Is this value a maximum, a minimum, or neither? (Recall that the vanishing of the derivative of a function $f(x)$ does not guarantee a maximum or minimum, e.g., $f(x) = x^3$ at $x = 0$.)
In this case you can prove that the value $z(-2, -1) = -3$ is the minimum value of $z$ by these algebraic steps. First set up a $u, v$-coordinate system with its origin at $(x, y) = (-2, -1)$. Take $x = u - 2$, $y = v - 1$. Then
\[ z = (u - 2)^2 - (u - 2)(v - 1) + (v - 1)^2 + 3(u - 2) = u^2 - uv + v^2 - 3. \]
Next, complete the square:
\[ z = \left(u - \frac{v}{2}\right)^2 - \frac{v^2}{4} + v^2 - 3 = \left(u - \frac{v}{2}\right)^2 + \frac{3}{4}v^2 - 3. \]
Since squares are nonnegative, $z(x, y) \ge -3$.
Answer: $-3$.
In general, to find possible maximum and minimum values of a function, locate points where all its partial derivatives are zero. Whether a particular one of these actually yields a maximum or a minimum may not be easy to determine. (Later we shall study a second derivative test which sometimes helps.)
EXAMPLE 6.2
Find the rectangular solid of maximum volume whose total edge length is a given constant.
Solution: As drawn in Fig. 6.2, the total length of the 12 edges is $4x + 4y + 4z$. Thus $4x + 4y + 4z = 4k$, where $k$ is a constant, so $x + y + z = k$. The volume is
\[ V = xyz = xy(k - x - y) = kxy - x^2y - xy^2. \]
The conditions for a maximum,
\[ \frac{\partial V}{\partial x} = 0, \qquad \frac{\partial V}{\partial y} = 0, \]
are
\[ ky - 2xy - y^2 = 0, \qquad kx - x^2 - 2xy = 0. \]
The nature of the geometric problem requires $x > 0$ and $y > 0$. Thus we may cancel $y$ from the first equation and $x$ from the second:
\[ 2x + y = k, \qquad x + 2y = k. \]
This pair of simultaneous linear equations has the unique solution
\[ x = \frac{k}{3}, \qquad y = \frac{k}{3}. \]
Hence also $z = k/3$; the solid is a cube.
Answer: A cube.
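The critical-point computation of Example 6.1 can be sanity-checked numerically. The sketch below is only an illustration: the helper names `z` and `solve_2x2` are hypothetical, and the grid search is evidence for the minimum, not a proof (the proof is the completed square in the text).

```python
from fractions import Fraction

def z(x, y):
    """The function of Example 6.1."""
    return x * x - x * y + y * y + 3 * x

def solve_2x2(a11, a12, b1, a21, a22, b2):
    """Solve a11*x + a12*y = b1, a21*x + a22*y = b2 by Cramer's rule."""
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("system is singular")
    x = Fraction(b1 * a22 - a12 * b2, det)
    y = Fraction(a11 * b2 - b1 * a21, det)
    return x, y

# Critical-point equations: 2x - y = -3 and -x + 2y = 0.
x0, y0 = solve_2x2(2, -1, -3, -1, 2, 0)
z0 = z(x0, y0)

# Sample a grid centered at the critical point (steps of 1/4, radius 10).
# No sampled value should fall below z0.
grid = [Fraction(k, 4) for k in range(-40, 41)]
grid_min = min(z(x0 + du, y0 + dv) for du in grid for dv in grid)
print(x0, y0, z0, grid_min)
```

Exact rational arithmetic is used so that the comparison with the critical value is not blurred by rounding.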
EXAMPLE 6.3
What is the largest possible volume, and what are the dimensions of an open rectangular aquarium constructed from 12 ft$^2$ of Plexiglas? Ignore the thickness of the plastic.
Solution: See Fig. 6.3. The volume is $V = xyz$. The total surface area of the bottom and four sides is
\[ xy + 2yz + 2zx = 12. \]
Solve for $z$, then substitute into the formula for $V$:
\[ z = \frac{12 - xy}{2(x + y)}, \qquad V = \frac{(12 - xy)\,xy}{2(x + y)}. \]
Compute the partials carefully (only one computation is needed because of the symmetry in $x$ and $y$):
\[ \frac{\partial V}{\partial x} = \frac{y^2(-x^2 - 2xy + 12)}{2(x + y)^2}, \qquad \frac{\partial V}{\partial y} = \frac{x^2(-y^2 - 2xy + 12)}{2(x + y)^2}. \]
Now find all points $(x, y)$ where both partials are zero. Such points must satisfy
\[ y^2(-x^2 - 2xy + 12) = 0, \qquad x^2(-y^2 - 2xy + 12) = 0. \]
By the nature of the problem, both $x$ and $y$ are positive quantities. Therefore, the factors $y^2$ and $x^2$ may safely be canceled:
\[ -x^2 - 2xy + 12 = 0, \qquad -y^2 - 2xy + 12 = 0. \]
It is easy to solve these equations for $x$ and $y$. Subtract: $x^2 - y^2 = 0$, hence $y = \pm x$. Since $x$ and $y$ are both positive, only $y = x$ applies. Now substitute $y = x$ into either of the two equations:
\[ -3x^2 + 12 = 0, \qquad x = \pm 2. \]
The only feasible choice is $x = y = 2$. It follows that $z = 1$. The optimal dimensions are $2 \times 2 \times 1$; the maximal volume is 4 ft$^3$.
Alternate Solution: Previously we chose the side lengths as variables, because volume and surface area can be expressed in terms of these side lengths. Is this the only way? Perhaps we can use the face areas as variables. They are
\[ u = yz, \qquad v = zx, \qquad w = xy. \]
The total surface area, in terms of $u$, $v$, $w$, is
\[ w + 2u + 2v = 12. \]
Can we express the volume in terms of $u$, $v$, and $w$? It is unnecessary to solve the system
\[ yz = u, \qquad zx = v, \qquad xy = w \]
for $x$, $y$, $z$ in terms of $u$, $v$, $w$, and then substitute the resulting expressions in $V = xyz$. We can express $V$ directly in $u$, $v$, $w$:
\[ V^2 = (xyz)^2 = x^2y^2z^2 = (yz)(zx)(xy) = uvw. \]
The maximum value of $V$ corresponds to the maximum value of $V^2$.
Now we have formulated a reasonable problem: Maximize uvw subject to the con straint 2u + 2v + w = 12. Set g(u, v) = uv( 12 2u 2v) = 12uv 2u2v 2uv2. Take partials: gu = I2v 4 uv 2v2y gv = 12m2 u2 4 uv. These partials are 0 where the equations 12v 4 uv 2v2 = 0 12 u 2 u2 4 uv = 0 are satisfied. Cancel the factor v in the first and u in the second (v = 0 or u = 0 doesnt hold water). The equations reduce to 2 u -j- v 6 u + 2v = 6. Hence u = v = 2. I t follows that w = 4 and F2= uvw = 16. The answer, however, should be in terms of x, y, and z. Recall yz = u = 2, zx = v = 2, xy = it? = 4. Hence x = y and xy = #2= w = 4. Therefore x = y = 2 and 2 = 1. .Answer: The maximum volume is 4 ft3, achieved by a tank of base 2 ft X 2 ft and height 1ft. 6. Maxima andMinima 223 R e m a r k : The theory of maxima and minima will be treated with more care in Chapter 9, Section 4. EXERCISES Find the possible maximum and minimum values: 1. z = 4 2x2 y2 2. z = x2 + y2 1 3 2 = ( z - 2) 2+ (y + 3)2 4. 0 = (* - 1)2+ y2 + 3 5. z = x2 2xy + 2y2 + 4 6. 2 = xy x2 2y2 + x + 2i/ 7. z = x y2 xs 8. 2 = 3x + 12i/ a;3 2/3. 9. Find the rectangular solid of greatest volume whose total surface area is 24 ft2. 10. Find the dimensions of an open-top rectangular box of given volume V, if the surface area is to be a minimum. 11. Find the dimensions of the cheapest open-top rectangular box of given volume 7, if the base material costs 3 times as much per ft2as the side material. 7. Linear Functions and Matrices 1. INTRODUCTION I n this chapter we shall discuss some parts of linear algebra that are useful in our further study of the calculus of functions of several variables. Linear algebra can be looked at as the study of one basic functional equa tion, (1) F(au + bv) = aF( u) + bF ( y), where a and b are scalars and u and v are vectors. This equation is fundamental in many branches of mathematics. We can think of F as an operation that transforms an input u into an output F( u) . 
Then equation (1) says that for input a(inputi) + fr(input2), the output is a(outputi) + 6 (outputs). Not every F has this property. Examples: F( x) = x2 for each real number x and F( u) = |u| for each vector u in R3. Remark: I f F satisfies (1), then it also satisfies (2) F(au + by + cw) = aF( u) + bF(v) + cF( w) (and similar equations for four or more summands). This equation follows in two steps from (1): F (au + bv + cw) = F(au + bv) + cF ( w) = [aF(u) + 6 F ( v ) ] + c F(w). An operation F satisfying (1) is called linear. I n the most general setting, the inputs (domain) of F may come from any mathematical system in which expressions au + bv make sense. The outputs may be in any other systems with the same property. Such systems are called abstract vector spaces; their study belongs to a course in linear algebra. We shall consider only linear operations with domain one of the spaces R, R2, R3and with values also in one of these spaces, not necessarily the same one. We shall call such operations linear functions. 1. Introduction 225 Real -Val ued Linear Functions Let us consider a linear function F defined on three-space R3, with real values. Thus F: R3--------- > R, so F is a function of three variables of the type studied in Chapter 6. However F is subject to an important restriction: it must satisfy (1 ). Now what does F look like? From previous experience, we would guess that F(x, y, z) = px + qy + rz is a reasonable possibility. Let us verify that this function is indeed linear. We set Xi = (#i, 2/1, zi) and x2= (x2, 2/2, 22). Then F( ax 1 + bx2) = F( ax 1 + bx2, ayi + by2, az\ + bz2) = p( ax 1 + bx 2) + g(a2/i + by2) + r(azi + 622) = a(px 1 + g?/i + r2i) + b(px2 + qy<i + rz2) = aF (x 1) + bF (x 2), so F does satisfy (1). I t is important that the converse is also true. Theorem The most general linear function F : R3-------- >R is given by F(x, y, 2) = px + qy + rz, where p, g, r are constants. 
Proof: We use the decomposition of an arbitrary vector into a combina tion of the basic unit vectors: (x, y, 2) = x\ + 2/j + 2k. From (2 ) it follows that F(x, y, 2) = F( xi + 2/j + k) = #F(i) + 2/F(j) + z F( k). Set p = ^(i ), q = F( j ), r = F(k), three scalars which do not depend on (x, y, 2). We have proved that F(x, y, z) = px + qy + rz. R e m a r k : The proof illustrates an important principle: a linear function F: R3-------- >R is completely determined by its effect on the three vectors i, j, k. As a corollary of the preceding theorem, we obtain a useful alternative representation for linear functions. 226 7. LINEAR FUNCTIONS AND MATRICES Theorem Let F: R3----- >R be linear. Then there is a vector p such that F{x) = p*x. Proof: J ust observe that F{x) = px + qy + rz = (p, q, r)* Or, y, z) = p-x, where p = (p, q, r). An interesting consequence is the following estimate on the order of magnitude of a linear function. I t will be useful in the next chapter when we study composite functions. Theorem Let F( x) = p*x. Then I T O I < Ip I |x |. Proof: Given x, let 0 be the angle between p and x. Then p*x = |p| |x| cos 0, hence \F(x)\ = |p*x| = |p| |x| |cos0| < |p| |x|. R e m a r k : I n some discussions it is common to call a function of the form f ( x, y, z) = Ax + By + Cz + D a linear function (A, B , C, D are constants) or a linear polynomial. I f D = 0, then f ( x, y, z) is called a homogeneous linear function. I n this chapter, we shall always mean homogeneous when we refer to linear functions. EXERCISES Let F: R3-------- >R be linear: 1. Find F(x, y, z) if F( 1,0,0) = 3, F(0, 1,0) = 2, F(0, 0, 1) = - 5. 2. Find F(x, y, z) if F( 1, 1, 0) = 0, F(0, 1, 1) = 1, F(l, 1, 1) = 3. 3. Show that the set of x for which F ( x ) = 0 is a plane through 0, provided Ft * 0. 4. Prove that F ( x ) is determined by its values at i, i + j, and i + j + k. 5. Prove that F ( x ) is not determined by its values at i and j . 6. Prove that F(x) is continuous. 7. 
Prove that the sum of two linear functions is linear. 8. Prove that cF is linear if F is linear and c is a scalar. 9. Let g vary over continuous functions on [0, 1] and set L(g) = J g(t) dt. Show that L is linear. 2. Linear Transformations 227 10. Let g vary over polynomials and set D(g) = g', the derivative. Show that D is linear. 11. Prove the identity (ciiXi + a2x2 + )2+ (a\X2 a2xi)2 + (a2xs azx2 )2+ (a3x i ai3)2 = (ai2+ a22+ a32) fa2+ z22+ z32). 12. (cont.) Hence give another proof, free of cosines, that |a x| < |a| |x|. Let F: R3-------- >R. Call F an affine function if F(au + 6v) = aF(u) + bF(v) whenever a + b = 1. 13. Prove that each non-homogeneous linear function is an affine function. 14*. Let F be an affine function. Prove that F (au + 6v + cw) = aF (u) + bF (v) -f- cF (w) if a + 6+ c = 1. 15*. (cont.) Prove that F is a non-homogeneous linear function. 2. LINEAR TRANSFORMATIONS I n this section we study linear functions F: R3-------- >R3. (Linear functions taking values in a space of dimension greater than one are often called linear transformations.) Let us begin by describing the most general function of R3 into R3, linear or not. I f F: R3-------- >R3, then for each x R3, F( x) = ( F , ( x ) , F , ( x ) , F , ( x ) ) . Thus F is equivalent to a triple Fh F2, Fz of real-valued functions called coordinate functions associated with F. Now, if F: R3-------- >R3is linear, we can show that each coordinate func tion Fj\ R3-------- >R is linear. We write the basic equation F(au + 6v) = aF( u) + bF(v) in coordinates: (Fi(au + by), F2(au + by), Fz(au + by)) = a(Fi (u) , F2(u), F3(u)) + b( Fx(y), F2(v), f t(v)). Now equate the/-th coordinates: Fj (au + by) = aFy(u) + 6Fy(v). Therefore F}: R3-------- >R and Fj is linear. By the second theorem of Section 1, there is a vector p; such that Fj (x) = py* x. 228 7. LINEAR FUNCTIONS AND MATRICES Theorem Let R3 such that -> R3 be linear. Then there are vectors pi, p2, P3 F( x) = (pi -x , p2 -x, p3* x ). 
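A Bound for F
By analogy with our procedure in Section 1, we may use the preceding theorem to obtain a bound for $|F(\mathbf{x})|$.
Theorem Let $F: \mathbf{R}^3 \to \mathbf{R}^3$ be a linear function. Then there is a constant $c$ such that
\[ |F(\mathbf{x})| \le c\,|\mathbf{x}|. \]
Proof: We have $F(\mathbf{x}) = (\mathbf{p}_1 \cdot \mathbf{x},\ \mathbf{p}_2 \cdot \mathbf{x},\ \mathbf{p}_3 \cdot \mathbf{x})$ and we know that $|\mathbf{p}_j \cdot \mathbf{x}| \le |\mathbf{p}_j|\,|\mathbf{x}|$. Therefore
\[ |F(\mathbf{x})|^2 = |\mathbf{p}_1 \cdot \mathbf{x}|^2 + |\mathbf{p}_2 \cdot \mathbf{x}|^2 + |\mathbf{p}_3 \cdot \mathbf{x}|^2 \le |\mathbf{p}_1|^2|\mathbf{x}|^2 + |\mathbf{p}_2|^2|\mathbf{x}|^2 + |\mathbf{p}_3|^2|\mathbf{x}|^2 = c^2|\mathbf{x}|^2, \]
where $c^2 = |\mathbf{p}_1|^2 + |\mathbf{p}_2|^2 + |\mathbf{p}_3|^2$. Hence $|F(\mathbf{x})| \le c\,|\mathbf{x}|$.
The Matrix of F
Again consider a linear function $F: \mathbf{R}^3 \to \mathbf{R}^3$. Let
\[ F(\mathbf{x}) = (\mathbf{p}_1 \cdot \mathbf{x},\ \mathbf{p}_2 \cdot \mathbf{x},\ \mathbf{p}_3 \cdot \mathbf{x}). \]
Each vector $\mathbf{p}_j$ has three components:
\[ \mathbf{p}_1 = (p_1, q_1, r_1), \qquad \mathbf{p}_2 = (p_2, q_2, r_2), \qquad \mathbf{p}_3 = (p_3, q_3, r_3). \]
The function $F$ is specified by the three vectors $\mathbf{p}_1$, $\mathbf{p}_2$, $\mathbf{p}_3$, and each of these is specified by its three components. Therefore $F$ is determined by the square array of 9 numbers
\[ A = \begin{bmatrix} p_1 & q_1 & r_1 \\ p_2 & q_2 & r_2 \\ p_3 & q_3 & r_3 \end{bmatrix}. \]
Such an array is called a $3 \times 3$ matrix. In this way each linear function $F: \mathbf{R}^3 \to \mathbf{R}^3$ gives rise to a $3 \times 3$ matrix $A$.
We have seen that the rows of the matrix $A$ are the vectors $\mathbf{p}_1$, $\mathbf{p}_2$, $\mathbf{p}_3$ written in components. There is also an interesting interpretation of the columns of $A$. Recall that
\[ F(\mathbf{x}) = (\mathbf{p}_1 \cdot \mathbf{x},\ \mathbf{p}_2 \cdot \mathbf{x},\ \mathbf{p}_3 \cdot \mathbf{x}). \]
In particular
\[ F(\mathbf{i}) = (\mathbf{p}_1 \cdot \mathbf{i},\ \mathbf{p}_2 \cdot \mathbf{i},\ \mathbf{p}_3 \cdot \mathbf{i}). \]
Now $\mathbf{p}_1 = p_1\mathbf{i} + q_1\mathbf{j} + r_1\mathbf{k}$, so $\mathbf{p}_1 \cdot \mathbf{i} = p_1$. Similarly, $\mathbf{p}_2 \cdot \mathbf{i} = p_2$ and $\mathbf{p}_3 \cdot \mathbf{i} = p_3$. Hence $F(\mathbf{i}) = (p_1, p_2, p_3)$. If this vector were written as a column
\[ \begin{bmatrix} p_1 \\ p_2 \\ p_3 \end{bmatrix} \]
instead of a row, it would match the first column of $A$. Similarly we find that $F(\mathbf{j})$ and $F(\mathbf{k})$ correspond to
\[ \begin{bmatrix} q_1 \\ q_2 \\ q_3 \end{bmatrix} \quad\text{and}\quad \begin{bmatrix} r_1 \\ r_2 \\ r_3 \end{bmatrix}. \]
Let $F: \mathbf{R}^3 \to \mathbf{R}^3$ be a linear function. Then $F$ determines a $3 \times 3$ matrix $A$ whose columns represent the vectors $F(\mathbf{i})$, $F(\mathbf{j})$, $F(\mathbf{k})$ respectively. Conversely, each $3 \times 3$ matrix $A$ determines a linear function $F: \mathbf{R}^3 \to \mathbf{R}^3$ as follows: If $\mathbf{c}_1$, $\mathbf{c}_2$, $\mathbf{c}_3$ are the vectors represented by the columns of $A$, then $F$ is determined by $F(\mathbf{i}) = \mathbf{c}_1$, $F(\mathbf{j}) = \mathbf{c}_2$, $F(\mathbf{k}) = \mathbf{c}_3$.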
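The two descriptions of the matrix (rows are the vectors p1, p2, p3; columns are the images of i, j, k) and the bound |F(x)| ≤ c|x| can be tried out on a small case. The map `F` below is a hypothetical linear function chosen only for illustration, and `matrix_of`, `dot`, `norm` are hypothetical helper names; this is a sketch, not part of the text's development.

```python
import math

def F(v):
    """A hypothetical linear map R^3 -> R^3, chosen only for illustration."""
    x, y, z = v
    return (2 * x - y, y + 3 * z, x + z)

def matrix_of(func):
    """Build the 3x3 matrix whose columns are func(i), func(j), func(k)."""
    cols = [func(e) for e in ((1, 0, 0), (0, 1, 0), (0, 0, 1))]
    # Entry (r, c) of the matrix is the r-th component of the c-th column.
    return [[cols[c][r] for c in range(3)] for r in range(3)]

def dot(p, x):
    return sum(pi * xi for pi, xi in zip(p, x))

def norm(v):
    return math.sqrt(dot(v, v))

A = matrix_of(F)                 # the rows of A play the role of p1, p2, p3
# c^2 = |p1|^2 + |p2|^2 + |p3|^2, the sum of the squares of all nine entries.
c = math.sqrt(sum(a * a for row in A for a in row))

x = (3, -2, 4)                   # arbitrary test vector
via_rows = tuple(dot(row, x) for row in A)   # (p1.x, p2.x, p3.x)
print(A)
print(via_rows == F(x), norm(F(x)) <= c * norm(x))
```

Building the matrix from the columns and then evaluating through the rows exercises both interpretations at once.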
Geometric Interpretati on Let F: R3-------- >R3be linear, so F assigns to each point x in 3-space another point F (x ) in 3-space. Thus F can be looked at geometrically as a mapping of space into itself. EXAMPLE 2.1 Describe geometrically the transformations (a) F{ x , y, z ) = (x, y, 0) (b) F(x, y, z) = (x, 2y, z). I n each case, find the associated matrix. 230 7. LINEAR FUNCTIONS AND MATRICES Solution: (a) The transformation assigns to xi + y) + zk the vector .ri + y\. I n other words F projects the vector x perpendicularly onto the x, 2/-plane. See Fig. 2.1a. Clearly F (i) = (1,0,0), F (j) = (0,1,0) and F ( k) = (0, 0, 0). The corresponding column vectors are the columns of the associated matrix: "1 0 0" 0 1 0 . 0 0 0 (Note position of axes.) F i g . 2.1 (b) The transformation doubles the ^-component of x. Geometrically this amounts to a stretching of space by a factor of 2 in the (/-direction. See Fig. 2.1b. We note that F(i) = (1,0,0), F(j ) = (0,2,0), F(k) = (0, 0, 1), so the associated matrix is "1 0 0 0 2 0 . 0 0 1 2. Linear Transformations 231 Answer: (a) (b) i projection onto the x , y-plane; stretching parallel to the y-axis by a factor of 2; i H - oo 1 1 o 0 r H 1 0 1 0 > 0 2 0 * oo0 1 0H - * 1 EXERCISES 1. Let F be linear with matrix A = [a;;]. Prove that |F (x)| < c |x| with c2= ^ cuf. 2. Let F and G be linear with respective matrices A and B. Prove that F + G is linear and find its matrix. 3. Let a be a vector. Prove that F(x) = a X x is linear. 4. (cont.) Find the corresponding matrix. Show that F is not linear: 5. F(x, y, z) = (x, y, 1) 6. F(x, y, z) = (xy, 0, 0). Find the matrix corresponding to the linear function: 7. F(x, y, z) = ( x, y, x + y + z) 8. F( x, y, z) (2x y Sz, 4x + by + 6z, y 2z) 9. F ( i ) = 0, F(j) = i + j + 2k, F( k) = 4k 10. F(\) = l , F( i ) = k, F( k) = \. For each matrix describe geometrically the corresponding linear function: "1 0 0 0 0 0" 11. 0 1 0 12. 0 0 0 .0 0 1- -0 0 CL 1 0 0" "0 0 O' 13. 0 0 0 14. 
0 0 0 -0 0 1- -0 0 1- _ 1 0 0 1 0 0 15. 0 - 1 0 16. 0 - 1 0 _ 0 0 -1_ _ 0 0 1- "1 0 0" cos a: sin a 17. 0 1 0 18. sin a cos a -0 0 _ 1- 0 0 232 7. LINEAR FUNCTIONS AND MATRICES "2 0 0" 0 1 0 19. 0 3 0 20. 1 0 0 -0 0 1. -0 0 1- Let F: R3- >R3be linear: 21. If F (i), F (j), F (k) all lie in a plane P passing through 0, show that F (x) lies in P for every x R3. 22. If F(i), /'"(j), F(k) all lie on a line L passing through 0, show that F ( x ) lies on L for every x R3. 23. Let F: R3 -------- R3 be linear and let a R3. Prove that x -------->a F(x) is linear on R3 to R. 24. (cont.) Suppose F: R3 - ->R3and x Prove that F is linear. ->a F (x) is linear for each a R3. 3. MATRIX CALCULATIONS I n Section 2, we saw the advantage of expressing vectors not only as row vectors, a = (ai, a2, as), but as column vectors ^ai a' = a2 3 We shall denote vectors in column form by primed letters. I n this section we introduce various kinds of operations involving rows, columns, and matrices. These facilitate the study of linear functions in Section 4. Row-by-Col umn Multipli cati on f The product of a row vector by a column vector is the scalar defined by ab' = (ai, a2, a3) CLlbl + &2&2~h I t is an alternative form of the dot product of two vectors. Note that this product is taken with the row first, then the column. 3. Matrix Calculations 233 Now let "fen 612 &13 B = & 21 & 22 &23 _&31 &32 &33, be a 3 X 3 matrix and a = (ai, a2, as) a row vector. We can multiply 5 on the left by a, taking the product of a with each column of B : "bn bn bis aB (0 1 , ci2><23) b2i b22 b23 _bsi bs2 bs3m (a i & l l + 02& 21 + 3& 31, Glfrl2 + a2& 22 + & 3& 32, ^1613 + 2&23 + Os b s s ) Similarly we can multiply B on the right by a column vector c', taking the product of each row of B with c': bn & 12 bis Ci bnCi + bi2c2 + 613C 3 B e ' = 6 2 1 b22 b2s c2 = b2iCi + b22c2 + b2sCs _b 31 bs2 bss^ _ C 3_ J>3iCi + bs2c2 + bssCs_ The answer is a row vector in the first case, a column vector in the second. 
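Examples:
\[ (3, -1, 1) \begin{bmatrix} 1 & 2 & 0 \\ -1 & -1 & 2 \\ 4 & -2 & -1 \end{bmatrix} = (8, 5, -3), \qquad \begin{bmatrix} 1 & 2 & 0 \\ -1 & -1 & 2 \\ 4 & -2 & -1 \end{bmatrix} \begin{bmatrix} 3 \\ -1 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \\ 13 \end{bmatrix}. \]
Sometimes it is convenient to abbreviate the notation. We can write $B$ as a row of three column vectors:
\[ B = (\mathbf{b}_1', \mathbf{b}_2', \mathbf{b}_3'), \quad\text{where}\quad \mathbf{b}_1' = \begin{bmatrix} b_{11} \\ b_{21} \\ b_{31} \end{bmatrix}, \quad \mathbf{b}_2' = \begin{bmatrix} b_{12} \\ b_{22} \\ b_{32} \end{bmatrix}, \quad \mathbf{b}_3' = \begin{bmatrix} b_{13} \\ b_{23} \\ b_{33} \end{bmatrix}. \]
Then $\mathbf{a}B$ is a row vector:
\[ \mathbf{a}B = \mathbf{a}(\mathbf{b}_1', \mathbf{b}_2', \mathbf{b}_3') = (\mathbf{a}\mathbf{b}_1', \mathbf{a}\mathbf{b}_2', \mathbf{a}\mathbf{b}_3'). \]
Each $\mathbf{a}\mathbf{b}_j'$ is a scalar, a row-by-column product.
Similarly we can write $B$ as a column of three row vectors:
\[ B = \begin{bmatrix} \mathbf{b}_1 \\ \mathbf{b}_2 \\ \mathbf{b}_3 \end{bmatrix}, \quad\text{where}\quad \mathbf{b}_1 = (b_{11}, b_{12}, b_{13}), \quad \mathbf{b}_2 = (b_{21}, b_{22}, b_{23}), \quad \mathbf{b}_3 = (b_{31}, b_{32}, b_{33}). \]
Then $B\mathbf{c}'$ is a column vector:
\[ B\mathbf{c}' = \begin{bmatrix} \mathbf{b}_1 \\ \mathbf{b}_2 \\ \mathbf{b}_3 \end{bmatrix} \mathbf{c}' = \begin{bmatrix} \mathbf{b}_1\mathbf{c}' \\ \mathbf{b}_2\mathbf{c}' \\ \mathbf{b}_3\mathbf{c}' \end{bmatrix}. \]
Row × Matrix × Column
Let $\mathbf{a}$ be a row vector, $B$ a matrix, and $\mathbf{c}'$ a column vector. Then $B\mathbf{c}'$ is a column vector, so we can form the scalar $\mathbf{a}(B\mathbf{c}')$. We can also form the scalar $(\mathbf{a}B)\mathbf{c}'$. It is an important fact (an associative law) that these two are equal:
\[ \mathbf{a}(B\mathbf{c}') = (\mathbf{a}B)\mathbf{c}'. \]
To prove it, we simply compute the product in either order. With the notation above,
\[ \mathbf{a}(B\mathbf{c}') = (a_1, a_2, a_3) \begin{bmatrix} \sum_j b_{1j}c_j \\ \sum_j b_{2j}c_j \\ \sum_j b_{3j}c_j \end{bmatrix} = \sum_{i=1}^{3} a_i \sum_{j=1}^{3} b_{ij}c_j = \sum_{i,j} a_i b_{ij} c_j. \]
The double sum contains 9 summands, and it does not depend on whether we sum first on $j$ or on $i$. If we sum first on $i$, then on $j$, we obtain
\[ \mathbf{a}(B\mathbf{c}') = \sum_{i,j} a_i b_{ij} c_j = \sum_{j} \Big( \sum_{i} a_i b_{ij} \Big) c_j = (\mathbf{a}B)\mathbf{c}'. \]
If $\mathbf{a}$ is a row vector, $\mathbf{c}'$ is a column vector, and $B$ is a matrix, then
\[ \mathbf{a}(B\mathbf{c}') = (\mathbf{a}B)\mathbf{c}' = \sum_{i,j} a_i b_{ij} c_j. \]
Product of Matrices
It is time to define the product of two $3 \times 3$ matrices $A$ and $B$. As you might suspect by now, we shall define $AB$ in terms of row-by-column products. Let $\mathbf{r}_1$, $\mathbf{r}_2$, $\mathbf{r}_3$ be the rows of $A$ and let $\mathbf{c}_1'$, $\mathbf{c}_2'$, $\mathbf{c}_3'$ be the columns of $B$, so
\[ A = \begin{bmatrix} \mathbf{r}_1 \\ \mathbf{r}_2 \\ \mathbf{r}_3 \end{bmatrix} \quad\text{and}\quad B = (\mathbf{c}_1', \mathbf{c}_2', \mathbf{c}_3'). \]
We define
\[ AB = \begin{bmatrix} \mathbf{r}_1 \\ \mathbf{r}_2 \\ \mathbf{r}_3 \end{bmatrix} (\mathbf{c}_1', \mathbf{c}_2', \mathbf{c}_3') = \begin{bmatrix} \mathbf{r}_1\mathbf{c}_1' & \mathbf{r}_1\mathbf{c}_2' & \mathbf{r}_1\mathbf{c}_3' \\ \mathbf{r}_2\mathbf{c}_1' & \mathbf{r}_2\mathbf{c}_2' & \mathbf{r}_2\mathbf{c}_3' \\ \mathbf{r}_3\mathbf{c}_1' & \mathbf{r}_3\mathbf{c}_2' & \mathbf{r}_3\mathbf{c}_3' \end{bmatrix}. \]
Thus the product $AB$ is another $3 \times 3$ matrix. The number in the $i,j$-th position of $AB$ is $\mathbf{r}_i\mathbf{c}_j'$, the product of the $i$-th row of $A$ by the $j$-th column of $B$.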
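Example:
\[ C = \begin{bmatrix} 2 & 0 & -1 \\ -1 & 3 & -2 \\ 2 & 4 & 4 \end{bmatrix} \begin{bmatrix} 3 & -3 & -3 \\ 1 & 2 & 1 \\ 5 & 1 & -4 \end{bmatrix}. \]
Then
\[ c_{11} = (2, 0, -1) \begin{bmatrix} 3 \\ 1 \\ 5 \end{bmatrix} = 1, \qquad c_{32} = (2, 4, 4) \begin{bmatrix} -3 \\ 2 \\ 1 \end{bmatrix} = 6, \quad\text{etc.,} \]
and we find
\[ C = \begin{bmatrix} 1 & -7 & -2 \\ -10 & 7 & 14 \\ 30 & 6 & -18 \end{bmatrix}. \]
Warning: Multiplication of matrices is not commutative. In general, $AB \ne BA$. For example, multiply the matrices of the example above in the opposite order:
\[ \begin{bmatrix} 3 & -3 & -3 \\ 1 & 2 & 1 \\ 5 & 1 & -4 \end{bmatrix} \begin{bmatrix} 2 & 0 & -1 \\ -1 & 3 & -2 \\ 2 & 4 & 4 \end{bmatrix} = \begin{bmatrix} 3 & -21 & -9 \\ 2 & 10 & -1 \\ 1 & -13 & -23 \end{bmatrix}. \]
The product is totally different.
Despite the failure of the commutative law for multiplication of matrices, the associative law does hold.
Associative Law If $A$, $B$, $C$ are $3 \times 3$ matrices, then
\[ A(BC) = (AB)C. \]
The proof is given in the next section.
Transpose
The transpose $A'$ of a matrix $A$ is obtained by flipping $A$ across its main diagonal (mirror reflection in the main diagonal; interchange of rows and columns):
\[ \begin{bmatrix} a & b \\ c & d \end{bmatrix}' = \begin{bmatrix} a & c \\ b & d \end{bmatrix}, \qquad \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}' = \begin{bmatrix} a_{11} & a_{21} & a_{31} \\ a_{12} & a_{22} & a_{32} \\ a_{13} & a_{23} & a_{33} \end{bmatrix}. \]
We also define transpose for row and column vectors:
\[ (a, b, c)' = \begin{bmatrix} a \\ b \\ c \end{bmatrix}, \qquad \begin{bmatrix} a \\ b \\ c \end{bmatrix}' = (a, b, c). \]
In any case, if the transpose operation is performed twice in succession, we return to where we started:
\[ (\mathbf{x}')' = \mathbf{x}, \qquad (A')' = A, \qquad [(\mathbf{y}')']' = \mathbf{y}'. \]
There is an important connection between matrix multiplication and transposing. Let $\mathbf{x}$ and $\mathbf{y}$ be row vectors and $A$ and $B$ matrices. Then
\[ \mathbf{x}\mathbf{y}' = \mathbf{y}\mathbf{x}', \qquad (\mathbf{x}A)' = A'\mathbf{x}', \qquad (A\mathbf{y}')' = \mathbf{y}A', \qquad \mathbf{x}A\mathbf{y}' = \mathbf{y}A'\mathbf{x}', \qquad (AB)' = B'A'. \]
The proofs of these relations are straightforward and are left as exercises.
Further Operations
Sometimes it is necessary to add two $3 \times 3$ matrices or multiply a matrix by a scalar. The definitions of these operations are quite natural:
\[ \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix} + \begin{bmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \\ b_{31} & b_{32} & b_{33} \end{bmatrix} = \begin{bmatrix} a_{11} + b_{11} & a_{12} + b_{12} & a_{13} + b_{13} \\ a_{21} + b_{21} & a_{22} + b_{22} & a_{23} + b_{23} \\ a_{31} + b_{31} & a_{32} + b_{32} & a_{33} + b_{33} \end{bmatrix}, \]
\[ c \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix} = \begin{bmatrix} ca_{11} & ca_{12} & ca_{13} \\ ca_{21} & ca_{22} & ca_{23} \\ ca_{31} & ca_{32} & ca_{33} \end{bmatrix}. \]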
Warning: In the second definition, each one of the 9 entries of the matrix is multiplied by $c$. This is different from the corresponding rule for determinants, where multiplying by $c$ is equivalent to multiplying just one row or one column by $c$.
Remark: It should be clear that everything in this section applies not only to $3 \times 3$ matrices, but also to $2 \times 2$ or $n \times n$ matrices.
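The operations of this section are easy to experiment with. The following sketch uses plain Python lists of rows; `matmul` and `transpose` are hypothetical helper names written for square matrices only, and the two matrices are the ones of the worked product example as best they can be read from the text (treat their entries as an assumption).

```python
def matmul(A, B):
    """Product of square matrices: entry (i, j) is row i of A times column j of B."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(A):
    """Interchange rows and columns."""
    return [list(col) for col in zip(*A)]

A = [[2, 0, -1], [-1, 3, -2], [2, 4, 4]]
B = [[3, -3, -3], [1, 2, 1], [5, 1, -4]]

AB = matmul(A, B)
BA = matmul(B, A)
print(AB)
print(BA)
print(AB != BA)                                              # not commutative
print(transpose(AB) == matmul(transpose(B), transpose(A)))   # (AB)' = B'A'
```

Note the reversed order on the right side of the transpose rule; swapping it to `matmul(transpose(A), transpose(B))` gives the transpose of BA instead.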
EXERCISES
Compute:
3. (2,0,8)
C I ]
1
1
LlJ
7.
0 1 1'
1 0 1
Ll 1 0J
9. (3,1,1)
1"
1
Ll.
1 1 1
1 1 1
Ll 2 3J
- 1
- 1
L 2J
15.
2 1
- 1 2
L -2 - 2
1
-1
1J
0 3
1 1
L1 - 1
17.
1 0 0"
0 0 0
LO 0 0.
CL\ 0*2 CLz
6l &2
Lei C2 C3.
4. (0,4,0)
1'
0
LlJ
V
j /J
- 2 0 o'
0 3 0
L 0 0 4J
10. (2, 0, - 1)
L cJ
- 1
1
L 1 1 - I J
" 2"
0
. - I -
14.
1
- 1
3 0 1'
- 1 3 2
L2 4 5J
1 1'
-1 - 1
1 0 5*
0 1 0
LO 0 U
0 0 0.
~b 1 0
16.
18.
0 b 1
LO 0 bJ
0 1 O'
1 0 0
LO 0 1.
ai 0*2 ciz
bi 62 h
-Cl C2 C3-
3
~a 0 0
19. 0 0 c 20. 0 b 0
-0 0 0- -0 0 C -
21. (x, y, z)
1 0 O'
0 2 0
L O 0 3J
Compute A2+ 3A:
L zJ
23 A =
- 3
0
j
22. (x, y, z)
1 2 O'
2 3 0
LO 0 4J
24. A =
1 2 O'
0 - 3 1
Ll 1 - I J
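25. Let $0$ be the $3 \times 3$ matrix all of whose entries are zero. If $A$ and $B$ are $3 \times 3$ matrices and $AB = 0$, does it follow that either $A = 0$ or $B = 0$?
Prove:
26. $\mathbf{x}\mathbf{y}' = \mathbf{y}\mathbf{x}'$
27. $(\mathbf{x}A)' = A'\mathbf{x}'$
28. $(A\mathbf{y}')' = \mathbf{y}A'$
29. $\mathbf{x}A\mathbf{y}' = \mathbf{y}A'\mathbf{x}'$
30. $(AB)' = B'A'$
31. $(A + B)' = A' + B'$.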
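4. APPLICATIONS
We return to linear functions of $\mathbf{R}^3$ into $\mathbf{R}^3$ and apply vector and matrix techniques. It will be convenient to write all vectors in columns. Thus we shall think of $\mathbf{R}^3$ as the set of all column vectors with three elements.
Suppose $F: \mathbf{R}^3 \to \mathbf{R}^3$ is linear. According to Section 2, there exist vectors $\mathbf{p}_1$, $\mathbf{p}_2$, $\mathbf{p}_3$ such that $F(\mathbf{x}) = (\mathbf{p}_1 \cdot \mathbf{x},\ \mathbf{p}_2 \cdot \mathbf{x},\ \mathbf{p}_3 \cdot \mathbf{x})$. But we were going to use column vectors instead of row vectors, so we rewrite this formula as follows:
\[ F(\mathbf{x}') = \begin{bmatrix} \mathbf{p}_1\mathbf{x}' \\ \mathbf{p}_2\mathbf{x}' \\ \mathbf{p}_3\mathbf{x}' \end{bmatrix}, \]
where $\mathbf{p}_1$, $\mathbf{p}_2$, $\mathbf{p}_3$ are row vectors. Recall that in Section 2 we associated with $F$ the $3 \times 3$ matrix $A$ whose rows are $\mathbf{p}_1$, $\mathbf{p}_2$, $\mathbf{p}_3$. Therefore the last formula is simply $F(\mathbf{x}') = A\mathbf{x}'$.
If $A$ is the matrix of the linear function $F: \mathbf{R}^3 \to \mathbf{R}^3$, then
\[ F(\mathbf{x}') = A\mathbf{x}' \]
for each column vector $\mathbf{x}'$.
Remark: The bound $|F(\mathbf{x}')| \le c\,|\mathbf{x}'|$ that we derived before can now be written for the product $A\mathbf{x}'$ of a matrix by a vector:
\[ |A\mathbf{x}'| \le c\,|\mathbf{x}'|, \]
where $c$ is a constant independent of $\mathbf{x}'$.
The formula $F(\mathbf{x}') = A\mathbf{x}'$ provides a practical algorithm for computing the values of a linear function.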
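EXAMPLE 4.1
Let $F: \mathbf{R}^3 \to \mathbf{R}^3$ be linear and satisfy
\[ F\left( \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} \right) = \begin{bmatrix} 4 \\ 5 \\ -2 \end{bmatrix}, \qquad F\left( \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} \right) = \begin{bmatrix} 7 \\ 12 \\ 0 \end{bmatrix}, \qquad F\left( \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \right) = \begin{bmatrix} -1 \\ 2 \\ 1 \end{bmatrix}. \]
Compute
\[ F\left( \begin{bmatrix} 2 \\ 7 \\ -3 \end{bmatrix} \right). \]
Solution: Let $A$ be the matrix of $F$. In Section 2, we showed that the columns of $A$ are
\[ F\left( \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} \right), \qquad F\left( \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} \right), \qquad F\left( \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \right) \]
(only there we used the notation $F(\mathbf{i})$, $F(\mathbf{j})$, $F(\mathbf{k})$). Consequently
\[ A = \begin{bmatrix} 4 & 7 & -1 \\ 5 & 12 & 2 \\ -2 & 0 & 1 \end{bmatrix}, \quad\text{so}\quad F\left( \begin{bmatrix} 2 \\ 7 \\ -3 \end{bmatrix} \right) = \begin{bmatrix} 4 & 7 & -1 \\ 5 & 12 & 2 \\ -2 & 0 & 1 \end{bmatrix} \begin{bmatrix} 2 \\ 7 \\ -3 \end{bmatrix} = \begin{bmatrix} 60 \\ 88 \\ -7 \end{bmatrix}. \]
Answer:
\[ \begin{bmatrix} 60 \\ 88 \\ -7 \end{bmatrix}. \]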
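The computation in Example 4.1 follows a fixed recipe: place the images of the three basis vectors side by side as columns, then multiply. A minimal sketch, assuming the data of the example are F(i) = (4, 5, -2), F(j) = (7, 12, 0), F(k) = (-1, 2, 1) and the input vector is (2, 7, -3) (the signs are read off from the printed product, so treat them as an assumption); `apply` is a hypothetical helper name.

```python
def apply(A, x):
    """Return A x' for a matrix A (list of rows) and a column vector x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

# Images of the basis vectors, as read from Example 4.1.
F_i, F_j, F_k = (4, 5, -2), (7, 12, 0), (-1, 2, 1)

# The images are the columns of A; zipping them yields the rows.
A = [list(row) for row in zip(F_i, F_j, F_k)]

result = apply(A, (2, 7, -3))
print(A)
print(result)
```

Equivalently, by linearity the result is 2 F(i) + 7 F(j) - 3 F(k), which is the same arithmetic organized by columns instead of rows.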
Composition of Linear Functions
Let $F$ and $G$ be linear functions with corresponding matrices $A$ and $B$:
\[ F(\mathbf{x}') = A\mathbf{x}', \qquad G(\mathbf{x}') = B\mathbf{x}'. \]
As usual, the composite function $F \circ G$ is defined by
\[ (F \circ G)(\mathbf{x}') = F[G(\mathbf{x}')]. \]
Theorem The composite of linear functions is again linear. If $F(\mathbf{x}') = A\mathbf{x}'$ and $G(\mathbf{x}') = B\mathbf{x}'$, then
\[ (F \circ G)(\mathbf{x}') = (AB)\mathbf{x}'. \]
Proof: To prove that $F \circ G$ is linear, we go right back to the definition:
\[ F[G(a\mathbf{u}' + b\mathbf{v}')] = F[aG(\mathbf{u}') + bG(\mathbf{v}')] = aF[G(\mathbf{u}')] + bF[G(\mathbf{v}')], \]
the first equality because $G$ is linear, the second because $F$ is linear. It follows that
\[ (F \circ G)(a\mathbf{u}' + b\mathbf{v}') = a(F \circ G)(\mathbf{u}') + b(F \circ G)(\mathbf{v}'), \]
that is, $F \circ G$ is linear.
We must compute the matrix corresponding to $F \circ G$. We have
\[ (F \circ G)(\mathbf{x}') = F[G(\mathbf{x}')] = F[B\mathbf{x}'] = A(B\mathbf{x}'). \]
To finish the proof, we must establish the associativity relation
\[ A(B\mathbf{x}') = (AB)\mathbf{x}'. \]
If $\mathbf{a}_i$ denotes the $i$-th row of $A$, then the $i$-th element of the column vector $A(B\mathbf{x}')$ is $\mathbf{a}_i(B\mathbf{x}')$. But we know that $\mathbf{a}_i(B\mathbf{x}') = (\mathbf{a}_iB)\mathbf{x}'$. However, $\mathbf{a}_iB$ is the $i$-th row of the matrix $AB$. Consequently $A(B\mathbf{x}')$ and $(AB)\mathbf{x}'$ have the same elements, i.e., they are equal, $A(B\mathbf{x}') = (AB)\mathbf{x}'$. This completes the proof that $(F \circ G)(\mathbf{x}') = (AB)\mathbf{x}'$.
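Here is a concrete application of composition of linear functions, showing how matrix multiplication can be a labor saving device. Consider this problem in algebraic manipulation: Given
\[ (1) \quad \begin{cases} r = 3x + 4y + 5z \\ s = 2x - y - z \\ t = 5x - 6y + 3z \end{cases} \qquad\qquad (2) \quad \begin{cases} x = u + v + 4w \\ y = 3u - 2v - w \\ z = 6u + 5v + 6w \end{cases} \]
express $r$, $s$, $t$ in terms of $u$, $v$, $w$.
The obvious approach is to substitute equations (2) into equations (1), simplify and collect terms. But this is tedious, while matrix multiplication will do it for you automatically. Write
\[ \begin{bmatrix} r \\ s \\ t \end{bmatrix} = A \begin{bmatrix} x \\ y \\ z \end{bmatrix}, \qquad \begin{bmatrix} x \\ y \\ z \end{bmatrix} = B \begin{bmatrix} u \\ v \\ w \end{bmatrix}, \]
where
\[ A = \begin{bmatrix} 3 & 4 & 5 \\ 2 & -1 & -1 \\ 5 & -6 & 3 \end{bmatrix}, \qquad B = \begin{bmatrix} 1 & 1 & 4 \\ 3 & -2 & -1 \\ 6 & 5 & 6 \end{bmatrix}. \]
Then
\[ \begin{bmatrix} r \\ s \\ t \end{bmatrix} = (AB) \begin{bmatrix} u \\ v \\ w \end{bmatrix} = \begin{bmatrix} 45 & 20 & 38 \\ -7 & -1 & 3 \\ 5 & 32 & 44 \end{bmatrix} \begin{bmatrix} u \\ v \\ w \end{bmatrix}. \]
Hence
\[ \begin{cases} r = 45u + 20v + 38w \\ s = -7u - v + 3w \\ t = 5u + 32v + 44w. \end{cases} \]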
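The substitution example can be replayed mechanically. The sketch below assumes the coefficient matrices A (for r, s, t in terms of x, y, z) and B (for x, y, z in terms of u, v, w) as reconstructed from the product printed in the text; `matmul` and `apply` are hypothetical helper names. It compares substituting in two steps with a single multiplication by AB.

```python
def matmul(A, B):
    """Matrix product of two lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def apply(M, v):
    """Matrix times column vector."""
    return [sum(m * vi for m, vi in zip(row, v)) for row in M]

A = [[3, 4, 5], [2, -1, -1], [5, -6, 3]]    # r, s, t in terms of x, y, z
B = [[1, 1, 4], [3, -2, -1], [6, 5, 6]]     # x, y, z in terms of u, v, w

AB = matmul(A, B)                            # r, s, t in terms of u, v, w

uvw = (2, -1, 3)                             # arbitrary test point
two_step = apply(A, apply(B, uvw))           # substitute (2), then (1)
one_step = apply(AB, uvw)                    # use the product matrix directly
print(AB)
print(two_step, one_step)
```

The order matters: the matrix applied last to the vector (here A) stands on the left of the product.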
Associative Law
We can now prove, with a minimum of computation, the associative law for multiplication of matrices given in Section 3.
Let $A$, $B$, $C$ be $3 \times 3$ matrices and let $F$, $G$, $H$ be the corresponding linear functions:
\[ F(\mathbf{x}') = A\mathbf{x}', \qquad G(\mathbf{x}') = B\mathbf{x}', \qquad H(\mathbf{x}') = C\mathbf{x}'. \]
We compute $F\{G[H(\mathbf{x}')]\}$ in two ways. First, $G[H(\mathbf{x}')] = (G \circ H)(\mathbf{x}')$, so
\[ F\{G[H(\mathbf{x}')]\} = F[(G \circ H)(\mathbf{x}')] = [F \circ (G \circ H)](\mathbf{x}'). \]
Second, $F[G(\mathbf{y}')] = (F \circ G)(\mathbf{y}')$. Replace $\mathbf{y}'$ by $H(\mathbf{x}')$:
\[ F\{G[H(\mathbf{x}')]\} = (F \circ G)[H(\mathbf{x}')] = [(F \circ G) \circ H](\mathbf{x}'). \]
We conclude that
\[ F \circ (G \circ H) = (F \circ G) \circ H. \]
Now we translate this into matrices. The matrix corresponding to $G \circ H$ is $BC$, hence the matrix corresponding to $F \circ (G \circ H)$ is $A(BC)$. Similarly the matrix corresponding to the linear function $(F \circ G) \circ H$ is $(AB)C$. Therefore
\[ A(BC) = (AB)C. \]
EXERCISES
1. Let $F: \mathbf{R}^3 \to \mathbf{R}^3$ be linear. If
"1 6" 0 - 4" "0 8"
F( 0
) =
2
F(
1
) =
3
n
0
) =
- 1
- 0- -7_ - 0- - 0- _ 1_ - 3-
compute
F(
5'
- 1
2. Given a 3 X 3 matrix A, verify that x'
F: R3-------- >R3.
A x ' defines a linear function
Let F and G be linear functions R3-
Describe F G:
R3with associated matrices A and B.
0 0 1" 2 0 0
C
O
I
I
0 1 1
I
I
o
q
3 0 0
. 0 0 0 - - 5 1 0 -
[
cos 0 sin 0~1 cc
sin 0 cos 0 J [_si
cos 0 sin 0"
; interpret geometrically.
-sin0 cos 0j |_sin0 cos0
5. Use matrix and vector notation to write the linear system
a\x + b\y + c\z d\
azx + b2y + c2z = d2
kdzX + bty + C3z = d3.
6. Let F: R3-------- >R3be linear with matrix A. Given y', prove that there is a
unique vector x' such that F ( x ' ) = y' if and only if the determinant of A is not
zero.
7. Given
x Su + 4v + 5w
y = u v + 2w
z = 2u + 2v w
express x, y, z in terms of r, s, t.
u = r s t
v 2r + 6s 7t
w = 3r 2s + t}
8. Given
ix = 2u 3v u = 4r 5s fr = 5p + 2q
y = r + 2s [s = p + 3q, [y = 6w+ v
express x and y in terms of p and q.
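9. Do Ex. 4 of Section 2 in column notation. That is, prove
\[ \mathbf{a}' \times \mathbf{x}' = A\mathbf{x}', \quad\text{where}\quad A = \begin{bmatrix} 0 & -a_3 & a_2 \\ a_3 & 0 & -a_1 \\ -a_2 & a_1 & 0 \end{bmatrix}. \]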
10. (cont.) Use this result to prove
a ' - ( b ' X c ' ) = c'
X
f
i
t
).
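11. Use Ex. 9 to show that the matrix of $F(\mathbf{x}') = \mathbf{a}' \times (\mathbf{b}' \times \mathbf{x}')$ is
\[ -(\mathbf{a}' \cdot \mathbf{b}')I + \mathbf{b}'\mathbf{a}. \]
12. (cont.) Now prove the identity
\[ \mathbf{a}' \times (\mathbf{b}' \times \mathbf{c}') = -(\mathbf{a}' \cdot \mathbf{b}')\mathbf{c}' + (\mathbf{a}' \cdot \mathbf{c}')\mathbf{b}'. \]
(Compare Ex. 21, p. 147.)
13. Let $A$ be a $3 \times 3$ matrix. We proved there is a constant $c$ such that $|A\mathbf{x}'| \le c\,|\mathbf{x}'|$ for all vectors $\mathbf{x}'$. Prove that $|A^n\mathbf{x}'| \le c^n\,|\mathbf{x}'|$ for $n = 1, 2, 3, \ldots$.
14*. (cont.) Prove that
\[ \sum_{n=0}^{\infty} \frac{1}{n!} A^n\mathbf{x}' \]
converges for each $\mathbf{x}'$. [Hint: Use Ex. 23, p. 205.]
15*. (cont.) Prove that
\[ \mathbf{x}' \longmapsto \sum_{n=0}^{\infty} \frac{1}{n!} A^n\mathbf{x}' \]
is a linear function. (It defines a matrix called $e^A$ or $\exp A$.)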
5. QUADRATIC FORMS
This section paves the way for the study of maxima and minima of functions of several variables in Chapter 9. In one variable calculus, for $f(x)$ to have a minimum at a point $c$ where $f'(c) = 0$, a sufficient condition is that $f''(c) > 0$. In several variables, the analogous second derivative test is more complicated, and requires some knowledge of quadratic forms.
A quadratic form is simply a homogeneous quadratic polynomial. The general quadratic form in two variables is
\[ f(x, y) = ax^2 + 2bxy + cy^2. \]
We note immediately a useful expression for $f$ in terms of matrices:
\[ f(x, y) = (x, y) \begin{bmatrix} a & b \\ b & c \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}. \]
This expression justifies using $2b$ instead of $b$ for the middle coefficient.
The general quadratic form in three variables is
\[ f(x, y, z) = (x, y, z) \begin{bmatrix} a & b & d \\ b & c & e \\ d & e & f \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = ax^2 + 2bxy + cy^2 + 2dzx + 2eyz + fz^2. \]
The matrices
\[ \begin{bmatrix} a & b \\ b & c \end{bmatrix} \quad\text{and}\quad \begin{bmatrix} a & b & d \\ b & c & e \\ d & e & f \end{bmatrix} \]
have mirror symmetry in the main diagonal (top left to bottom right) and so are called symmetric matrices. This property is concisely stated in terms of transposes:
A matrix $A$ is symmetric if and only if $A' = A$.
To each symmetric matrix corresponds a unique quadratic form. Conversely, to each quadratic form corresponds a unique symmetric matrix. This converse statement may seem obvious, but it does require a proof, which we give for the two variable case.
Suppose
\[ A_1 = \begin{bmatrix} a_1 & b_1 \\ b_1 & c_1 \end{bmatrix} \quad\text{and}\quad A_2 = \begin{bmatrix} a_2 & b_2 \\ b_2 & c_2 \end{bmatrix} \]
determine the same quadratic form. Then
\[ a_1x^2 + 2b_1xy + c_1y^2 = a_2x^2 + 2b_2xy + c_2y^2 \]
for all $x$ and $y$. The choices $(x, y) = (1, 0)$, $(0, 1)$, and $(1, 1)$ yield
\[ a_1 = a_2, \qquad c_1 = c_2, \qquad b_1 = b_2. \]
Hence $A_1 = A_2$. A similar easy proof applies for three variables.
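The general quadratic form may be written as $\mathbf{x}A\mathbf{x}'$, where $A$ is a unique symmetric matrix.
Examples of quadratic forms:
\[ x^2 + y^2 = (x, y) \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}, \qquad 2xy = (x, y) \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}, \]
\[ ax^2 + by^2 + cz^2 = (x, y, z) \begin{bmatrix} a & 0 & 0 \\ 0 & b & 0 \\ 0 & 0 & c \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix}, \]
\[ 2cxy + 2ayz + 2bzx = (x, y, z) \begin{bmatrix} 0 & c & b \\ c & 0 & a \\ b & a & 0 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix}. \]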
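The correspondence between a symmetric matrix and its quadratic form can be checked by direct evaluation. The sketch below, with hypothetical helper names `quad_form` and `is_symmetric` and arbitrarily chosen sample numbers, evaluates the form 2cxy + 2ayz + 2bzx both as a polynomial and as the product x A x'.

```python
def quad_form(A, x):
    """Evaluate x A x' for a square matrix A and a vector x."""
    n = len(x)
    return sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))

def is_symmetric(A):
    """A is symmetric when it equals its transpose."""
    n = len(A)
    return all(A[i][j] == A[j][i] for i in range(n) for j in range(n))

a, b, c = 2, -3, 5            # arbitrary sample coefficients
A = [[0, c, b],
     [c, 0, a],
     [b, a, 0]]

x, y, z = 1, 4, -2            # arbitrary sample point
lhs = 2 * c * x * y + 2 * a * y * z + 2 * b * z * x
rhs = quad_form(A, (x, y, z))
print(is_symmetric(A), lhs, rhs)
```

Each off-diagonal entry appears twice in the double sum, once as (i, j) and once as (j, i), which is why the polynomial carries the factors of 2.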
Positive Definite Matrices
Take a quadratic form in two variables,
\[ f(x, y) = \mathbf{x}A\mathbf{x}' = ax^2 + 2bxy + cy^2, \]
where
\[ \mathbf{x} = (x, y) \quad\text{and}\quad A = \begin{bmatrix} a & b \\ b & c \end{bmatrix}. \]
We want to find conditions on the coefficients $a$, $b$, $c$ so that
\[ f(x, y) > 0 \quad\text{for all } (x, y) \ne (0, 0). \]
When this is the case both $f$ and $A$ are called positive definite.
The simplest case of a positive definite quadratic form is $f(x, y) = x^2 + y^2$. The associated positive definite matrix is
\[ A = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}. \]
Other examples:
\[ f(x, y) = x^2 + 2xy + 5y^2 = (x + y)^2 + 4y^2, \qquad A = \begin{bmatrix} 1 & 1 \\ 1 & 5 \end{bmatrix}, \]
\[ f(x, y) = x^2 - 6xy + 11y^2 = (x - 3y)^2 + 2y^2, \qquad A = \begin{bmatrix} 1 & -3 \\ -3 & 11 \end{bmatrix}. \]
To check that $(x - 3y)^2 + 2y^2$ is positive definite, suppose $(x, y) \ne (0, 0)$. If $y \ne 0$, then
\[ (x - 3y)^2 + 2y^2 \ge 2y^2 > 0. \]
If $y = 0$, then $x \ne 0$ and
\[ (x - 3y)^2 + 2y^2 = x^2 > 0. \]
We are ready to state the basic facts about positive definite matrices. We begin with the $2 \times 2$ case:
5. Quadratic Forms 247
Theorem 5.1 The symmetric matrix
\[ \begin{bmatrix} a & b \\ b & c \end{bmatrix} \]
is positive definite if and only if
\[ a > 0 \quad\text{and}\quad \begin{vmatrix} a & b \\ b & c \end{vmatrix} > 0. \]
Proof: Suppose the matrix is positive definite. Then
\[ f(x, y) = ax^2 + 2bxy + cy^2 > 0 \]
whenever $(x, y) \neq (0, 0)$. In particular a = f(1, 0) > 0. Now complete the
square:
\[ f(x, y) = a\left(x + \frac{b}{a}y\right)^2 + \left(\frac{ac - b^2}{a}\right)y^2. \]
Then f(-b/a, 1) > 0, hence
\[ \frac{ac - b^2}{a} > 0. \]
Therefore a > 0 and $ac - b^2 > 0$.
Conversely, suppose a > 0 and $ac - b^2 > 0$. Then certainly
\[ f(x, y) = a\left(x + \frac{b}{a}y\right)^2 + \left(\frac{ac - b^2}{a}\right)y^2 \geq 0 \]
for any (x, y). Furthermore, f(x, y) = 0 only if each of the squared quantities
is zero:
\[ x + \frac{b}{a}y = 0, \qquad y = 0, \]
hence only for (x, y) = (0, 0). This completes the proof.
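The criterion of Theorem 5.1 is easy to mechanize. Here is a small sketch in plain Python; the function name `is_positive_definite_2x2` is a hypothetical helper introduced only for illustration.

```python
def is_positive_definite_2x2(a, b, c):
    """Test of Theorem 5.1 for the symmetric matrix [[a, b], [b, c]]:
    positive definite exactly when a > 0 and ac - b^2 > 0."""
    return a > 0 and a * c - b * b > 0

print(is_positive_definite_2x2(1, 1, 5))    # x^2 + 2xy + 5y^2
print(is_positive_definite_2x2(1, -3, 11))  # x^2 - 6xy + 11y^2
print(is_positive_definite_2x2(0, 1, 0))    # 2xy, which takes both signs
```

The first two forms are the examples completed to sums of squares in the text; the third fails the test because a = 0.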
There is a corresponding theorem for 3 × 3 symmetric matrices. A proof
based on completing the square is possible, but it is long and tedious, so we
shall omit it. In linear algebra courses other types of proofs are often
developed. (Compare the remark after Theorem 8.8.)
Theorem 5.2 The symmetric matrix
\[ \begin{bmatrix} a & b & d \\ b & c & e \\ d & e & f \end{bmatrix} \]
is positive definite if and only if
\[ a > 0, \qquad \begin{vmatrix} a & b \\ b & c \end{vmatrix} > 0, \qquad \begin{vmatrix} a & b & d \\ b & c & e \\ d & e & f \end{vmatrix} > 0. \]
Negative Definite Matrices
Let A be a symmetric matrix and $f(\mathbf{x}) = \mathbf{x}A\mathbf{x}'$ the corresponding
quadratic form. We say that A and f are negative definite if $f(\mathbf{x}) < 0$ except for
$\mathbf{x} = \mathbf{0}$. It is pretty clear that $f(\mathbf{x}) < 0$ is the same thing as $-f(\mathbf{x}) > 0$, hence
A is negative definite if and only if -A is positive definite, where -A denotes
the matrix whose elements are the negatives of the elements of A (it is the
matrix of -f). A corresponding modification of Theorem 5.2 gives the
following test for negative definiteness:
Theorem 5.3 The symmetric matrix
\[ \begin{bmatrix} a & b & d \\ b & c & e \\ d & e & f \end{bmatrix} \]
is negative definite if and only if
\[ a < 0, \qquad \begin{vmatrix} a & b \\ b & c \end{vmatrix} > 0, \qquad \begin{vmatrix} a & b & d \\ b & c & e \\ d & e & f \end{vmatrix} < 0. \]
The corresponding conditions for a quadratic form $ax^2 + 2bxy + cy^2$ in two
variables are simply a < 0, $ac - b^2 > 0$.
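The tests of Theorems 5.2 and 5.3 can be sketched in a few lines of plain Python. The helper names below are assumptions made for this illustration, and the sample matrix is not taken from the text. Negative definiteness is tested exactly as the text suggests, by applying the positive definite test to -A.

```python
def det2(p, q, r, s):
    """Determinant of [[p, q], [r, s]]."""
    return p * s - q * r

def det3(M):
    """Determinant of a 3x3 matrix, expanded by minors of the first row."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * det2(e, f, h, i) - b * det2(d, f, g, i) + c * det2(d, e, g, h)

def is_positive_definite_3x3(M):
    """Theorem 5.2: all three leading determinants are positive.
    M is assumed to be symmetric."""
    m1 = M[0][0]
    m2 = det2(M[0][0], M[0][1], M[1][0], M[1][1])
    m3 = det3(M)
    return m1 > 0 and m2 > 0 and m3 > 0

def is_negative_definite_3x3(M):
    """A is negative definite if and only if -A is positive definite."""
    return is_positive_definite_3x3([[-v for v in row] for row in M])

M = [[2, -1, 0],
     [-1, 2, -1],
     [0, -1, 2]]
neg_M = [[-v for v in row] for row in M]

print(is_positive_definite_3x3(M))      # leading determinants are 2, 3, 4
print(is_negative_definite_3x3(neg_M))  # same matrix with all signs changed
```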
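A Lower Bound
The following result is essential in our discussion later of maxima and
minima. We prove it for three variables, but an immediate modification
covers the two variable case.
Theorem 5.4 Let $f(\mathbf{x})$ be a positive definite quadratic form. Then there
exists a constant k > 0 such that
\[ f(\mathbf{x}) \geq k\,|\mathbf{x}|^2 \]
for all $\mathbf{x}$.
Proof: Let $S = \{\mathbf{x} \mid |\mathbf{x}| = 1\}$ denote the unit sphere, a closed and
bounded set. The function f is continuous on S, so it has a minimum on S.
That is, there is a point $\mathbf{x}_0 \in S$ such that $f(\mathbf{x}) \geq f(\mathbf{x}_0)$ for all $\mathbf{x} \in S$. We set
$k = f(\mathbf{x}_0)$. Since f is positive definite, we have k > 0.
Let $\mathbf{x} \neq \mathbf{0}$. Then $\mathbf{x}/|\mathbf{x}| \in S$, hence $f(\mathbf{x}/|\mathbf{x}|) \geq k$. But $f(\mathbf{x}/|\mathbf{x}|) = f(\mathbf{x})/|\mathbf{x}|^2$
since f is a quadratic form. Therefore $f(\mathbf{x})/|\mathbf{x}|^2 \geq k$, and finally $f(\mathbf{x}) \geq k\,|\mathbf{x}|^2$.
For $\mathbf{x} = \mathbf{0}$ both sides are zero, so the inequality holds there as well.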
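The constant k in Theorem 5.4 is the minimum of f on the unit sphere, so in two variables it can be estimated by sampling the unit circle. The sketch below does this for the form $x^2 + 2xy + 5y^2$ from the earlier examples. The function name `min_on_unit_circle` and the sample count are assumptions; sampling gives only an approximation (slightly above the true minimum), not a proof.

```python
import math

def f(x, y):
    """The positive definite form x^2 + 2xy + 5y^2."""
    return x * x + 2 * x * y + 5 * y * y

def min_on_unit_circle(form, samples=100_000):
    """Approximate the minimum of form(x, y) over the circle x^2 + y^2 = 1
    by evaluating it at equally spaced angles."""
    return min(
        form(math.cos(2 * math.pi * i / samples),
             math.sin(2 * math.pi * i / samples))
        for i in range(samples)
    )

k = min_on_unit_circle(f)
print(k)  # approximate lower-bound constant for this form
```

On the unit circle this form equals $3 - 2\cos 2t + \sin 2t$, whose minimum is $3 - \sqrt{5}$, so the sampled value should be close to that number.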
Semi-definite Forms
Positive definiteness is defined by $f(\mathbf{x}) > 0$ for all $\mathbf{x} \neq \mathbf{0}$. A quadratic
form $f(\mathbf{x})$ is called positive semi-definite if
\[ f(\mathbf{x}) \geq 0 \quad\text{for all } \mathbf{x}. \]
The property negative semi-definite is defined similarly. We use the same
adjectives for the symmetric matrix corresponding to the form $f(\mathbf{x})$.
Theorem 5.5 The quadratic form $f(\mathbf{x}) = ax^2 + 2bxy + cy^2$ is positive
semi-definite if and only if
\[ a \geq 0, \quad c \geq 0, \quad ac - b^2 \geq 0. \]
Proof: Suppose $f(\mathbf{x})$ is positive semi-definite. Let h > 0. Then
\[ f(\mathbf{x}) + h(x^2 + y^2) = (a + h)x^2 + 2bxy + (c + h)y^2 \]
is positive definite. By Theorem 5.1,
\[ a + h > 0, \qquad (a + h)(c + h) - b^2 > 0. \]
Also c + h > 0 as we see, for instance, by interchanging x and y. Now let
$h \to 0$. Then we conclude that
\[ a \geq 0, \quad c \geq 0, \quad ac - b^2 \geq 0. \]
Conversely, suppose these three inequalities hold. Let h > 0. Then
a + h > 0, c + h > 0, and
\[ (a + h)(c + h) - b^2 = (ac - b^2) + (a + c)h + h^2 \geq h^2 > 0. \]
Hence $(a + h)x^2 + 2bxy + (c + h)y^2$ is positive definite. Let $(x, y) \neq (0, 0)$.
Then
\[ (a + h)x^2 + 2bxy + (c + h)y^2 > 0. \]
Let $h \to 0$:
\[ f(\mathbf{x}) = ax^2 + 2bxy + cy^2 \geq 0. \]
Therefore $f(\mathbf{x})$ is positive semi-definite.
Remark 1: An alternate proof can be constructed by modifying the proof
of Theorem 5.1.
Remark 2: The corresponding conditions for $f(\mathbf{x})$ to be negative
semi-definite are
\[ a \leq 0, \quad c \leq 0, \quad ac - b^2 \geq 0. \]
We shall state without proof the corresponding results for three dimensions.
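Theorem 5.6 The symmetric matrix
\[ A = \begin{bmatrix} a & b & d \\ b & c & e \\ d & e & f \end{bmatrix} \]
is positive semi-definite if and only if all of its principal minors are
non-negative, that is,
\[ a \geq 0, \quad c \geq 0, \quad f \geq 0, \]
\[ ac - b^2 \geq 0, \quad cf - e^2 \geq 0, \quad af - d^2 \geq 0, \]
\[ |A| = \begin{vmatrix} a & b & d \\ b & c & e \\ d & e & f \end{vmatrix} \geq 0. \]
The conditions for A to be negative semi-definite are
\[ a \leq 0, \quad c \leq 0, \quad f \leq 0, \]
\[ ac - b^2 \geq 0, \quad cf - e^2 \geq 0, \quad af - d^2 \geq 0, \]
\[ |A| \leq 0. \]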
For instance, consider
\[ A = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix}. \]
Then $a = c = f = 1 \geq 0$, $ac - b^2 = cf - e^2 = af - d^2 = 0 \geq 0$, and
$|A| = 0 \geq 0$, so A is positive semi-definite. If $\mathbf{x} = (x, y, z)$, note that
$f(\mathbf{x}) = \mathbf{x}A\mathbf{x}' = (x + y + z)^2 \geq 0$, confirming the diagnosis. (But A is not positive definite;
for instance f(1, 1, -2) = 0.)
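The seven principal minors of Theorem 5.6 can be listed and tested mechanically. This is a minimal sketch in plain Python for the all-ones example; the helper names `principal_minors` and `is_positive_semidefinite_3x3` are assumptions introduced for the illustration.

```python
def det3(M):
    """Determinant of a 3x3 matrix, expanded by minors of the first row."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def principal_minors(M):
    """The principal minors of the symmetric matrix
    [[a, b, d], [b, c, e], [d, e, f]], in the order used in Theorem 5.6."""
    a, b, d = M[0]
    c, e = M[1][1], M[1][2]
    f = M[2][2]
    return [a, c, f, a * c - b * b, c * f - e * e, a * f - d * d, det3(M)]

def is_positive_semidefinite_3x3(M):
    """Theorem 5.6: every principal minor must be non-negative."""
    return all(m >= 0 for m in principal_minors(M))

ones = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
print(principal_minors(ones))
print(is_positive_semidefinite_3x3(ones))

# The form is (x + y + z)^2, which vanishes at the non-zero point (1, 1, -2).
x, y, z = 1, 1, -2
print((x + y + z) ** 2)
```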
Indefinite Forms
A quadratic form $f(\mathbf{x}) = \mathbf{x}A\mathbf{x}'$ is called indefinite if it is neither positive
nor negative semi-definite, that is, if there are points $\mathbf{x}_1$ and $\mathbf{x}_2$ such that
$f(\mathbf{x}_1) < 0$ and $f(\mathbf{x}_2) > 0$.
Theorem 5.7 A form $f(\mathbf{x}) = ax^2 + 2bxy + cy^2$ in two variables is
indefinite if and only if
\[ \begin{vmatrix} a & b \\ b & c \end{vmatrix} < 0. \]
Proof: First suppose $ac - b^2 < 0$. By Theorem 5.5, $f(\mathbf{x})$ is not positive
semi-definite. By the corresponding result mentioned in Remark 2 above, $f(\mathbf{x})$
is not negative semi-definite. Therefore $f(\mathbf{x})$ is indefinite.
Now suppose $f(\mathbf{x})$ is indefinite. We examine three cases: (1) $a \geq 0$, $c \geq 0$.
Then $ac - b^2 < 0$, otherwise $f(\mathbf{x})$ is positive semi-definite by Theorem 5.5.
(2) $a \leq 0$, $c \leq 0$. Then $ac - b^2 < 0$, otherwise $f(\mathbf{x})$ is negative semi-definite.
(3) ac < 0. Then $ac - b^2 \leq ac < 0$. This exhausts all possibilities; the proof
is complete.
There is not such a simple criterion for indefiniteness of a 3 × 3 matrix
\[ A = \begin{bmatrix} a & b & d \\ b & c & e \\ d & e & f \end{bmatrix}. \]
The best we can do is try the test in Theorem 5.6. If A fails to be positive
semi-definite, and if it fails to be negative semi-definite, then it must be
indefinite. Thus we first inspect a, c, f. If two have different signs, A is
indefinite; if not we inspect the 2 × 2 principal minors
\[ ac - b^2, \quad cf - e^2, \quad af - d^2. \]
If one of these is negative, A is indefinite; if not we inspect the determinant
|A|. If its sign differs from that of a, c, and f, then A is indefinite; if not, A is
positive or negative semi-definite.
For instance, consider
\[ A = \begin{bmatrix} -1 & 1 & 1 \\ 1 & -1 & 1 \\ 1 & 1 & -1 \end{bmatrix}. \]
Then a = c = f = -1 < 0, so we pass to 2 × 2 principal minors: $ac - b^2 =
cf - e^2 = af - d^2 = 0$, so we still do not know if A is or isn't indefinite. But
|A| = 4 > 0, so A is indefinite. If this is not a convincing argument, then
here is an overwhelming one: set $\mathbf{x} = (1, 0, 0)$ and $\mathbf{y} = (1, 1, 1)$. Then
\[ \mathbf{x}A\mathbf{x}' = -1 < 0 \quad\text{and}\quad \mathbf{y}A\mathbf{y}' = 3 > 0. \]
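The inspection procedure described above can be sketched as a small classifier. This is an illustrative plain-Python version under the assumption that the input is a symmetric 3 × 3 matrix; the function names are hypothetical. A result of "positive semi-definite" includes the positive definite case, and likewise for negative.

```python
def det3(M):
    """Determinant of a 3x3 matrix, expanded by minors of the first row."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def quadratic_form(M, x):
    """Return x M x' for a 3x3 matrix M and a vector x."""
    return sum(x[i] * M[i][j] * x[j] for i in range(3) for j in range(3))

def classify(M):
    """Classify a symmetric 3x3 matrix with the principal minors of Theorem 5.6."""
    diag = [M[0][0], M[1][1], M[2][2]]
    minors2 = [M[0][0] * M[1][1] - M[0][1] ** 2,
               M[1][1] * M[2][2] - M[1][2] ** 2,
               M[0][0] * M[2][2] - M[0][2] ** 2]
    d = det3(M)
    if min(minors2) >= 0:
        if min(diag) >= 0 and d >= 0:
            return "positive semi-definite"
        if max(diag) <= 0 and d <= 0:
            return "negative semi-definite"
    return "indefinite"

A = [[-1, 1, 1],
     [1, -1, 1],
     [1, 1, -1]]
print(classify(A))
print(quadratic_form(A, (1, 0, 0)))  # a negative value of the form
print(quadratic_form(A, (1, 1, 1)))  # a positive value of the form
```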
EXERCISES
Write out the quadratic form corresponding to the symmetric matrix:
r 4 - i
1
- I 5
2.
L - i 3J
5 2
"1 0 0" "1 2 3
0 2 4 4. 2 4 6
O0
0
1_3 6 5-
3.
Find the symmetric matrix of the quadratic form:
5. $x^2 - y^2$    6. $xy$
7. $(x + y)^2$    8. $(x - 2y)^2$
9. $(ax + by)^2$    10. $(ax + by)(cx + dy)$
11. $(x + y)(y + z)$    12. $(x + 2y + 3z)^2$
13. $(ax + by + cz)^2$    14. $(ax + by + cz)(\alpha x + \beta y + \gamma z)$.
Determine whether the quadratic form is positive definite:
15. $x^2 + 4xy + 2y^2$
16. $9x^2 - 12xy + y^2$
17. $x^2 + 6xy + 10y^2$
18. $3x^2 + 2xy + y^2 - 2zx + 3zy + z^2$.
19. Let A be any 3 × 3 matrix, not necessarily symmetric. Then $f(\mathbf{x}) = \mathbf{x}A\mathbf{x}'$ is a
quadratic form. Find its matrix.
20. (cont.) Do the special case
\[ A = \begin{bmatrix} 0 & 0 & 0 \\ 2 & 0 & 0 \\ -4 & 6 & 0 \end{bmatrix}. \]
21. Prove that the sum of two positive definite quadratic forms is again positive
definite. Find the corresponding matrix.
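22. Let A be symmetric, $\mathbf{x} = (x_1, x_2, x_3)$, and $f(\mathbf{x}) = \mathbf{x}A\mathbf{x}'$. Prove that
\[ \left(\frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \frac{\partial f}{\partial x_3}\right) = 2\mathbf{x}A. \]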
23. Let A be any 3 × 3 matrix. Prove that $\tfrac{1}{2}(A + A')$ is symmetric.
24. Let A be any 3 × 3 matrix. Prove that AA' is symmetric.
25. Suppose A and B are symmetric 3 × 3 matrices that commute, that is, AB = BA.
Prove that AB is symmetric.
26*. Suppose A and B are positive definite 2 × 2 matrices. Is the quadratic form
$f(\mathbf{x}) = \mathbf{x}AB\mathbf{x}'$ positive definite? (Note that AB need not be symmetric.)
27. The form $x^2 + 2xy + 2y^2$ is positive definite so there is a k > 0 such that
$x^2 + 2xy + 2y^2 \geq k(x^2 + y^2)$ for all (x, y). Find the largest such k.
28*. (cont.) Solve this problem in general for a positive definite
$f(\mathbf{x}) = ax^2 + 2bxy + cy^2$.
29. Prove the statement in Remark 2, p. 250.
30. Carry out the proof suggested in Remark 1, p. 250.
31*. Let $f(x, y) = ax^2 + 2bxy + cy^2$ be an indefinite form. Prove that f(x, y) is the
product of two non-proportional linear functions. [Hint: Rotate coordinates.]
32. (cont.) Sketch in the plane the regions f > 0, f < 0, f = 0.
6. QUADRIC SURFACES
A quadric surface is the graph of an equation f(x, y, z) = 0, where
f(x, y, z) is a quadratic polynomial. The most general quadratic polynomial f
is the sum of a quadratic form and a linear function,
\[ f(x, y, z) = Q(\mathbf{x}) + px + qy + rz + k. \]
The quadratic form Q involves the pure terms $x^2$, $y^2$, $z^2$ and the mixed terms
xy, yz, zx. According to a fairly deep result in linear algebra, the mixed terms
can be eliminated by rotating the coordinate system. We shall accept this
result without proof. It means that we may confine our study of quadric
surfaces to the cases where there are no mixed quadratic terms.
Thus we consider f(x, y, z) = 0, where
\[ f(x, y, z) = Ax^2 + By^2 + Cz^2 + px + qy + rz + k. \]
Of course, if there are no quadratic terms, then f = 0 represents a plane,
which doesn't interest us here.
If $A \neq 0$, then a translation in the x-direction eliminates the term px, and
similarly if $B \neq 0$ or $C \neq 0$. This reduces our study to the following types of f:
(i) $Ax^2 + By^2 + Cz^2 + k$
(ii) $Ax^2 + By^2 + rz + k$
(iii) $Ax^2 + qy + rz + k$.
These include all possibilities, provided we are willing to permute the variables.
For instance, the polynomial $Cz^2 + px + qy + k$ becomes type (iii) when x
and z are interchanged.
In type (ii), if $r \neq 0$, then a translation in the z-direction eliminates k.
In type (iii), a rotation in the y, z-plane, taken so that qy + rz = 0 is the new
z-axis, changes the function to the form $Ax^2 + qy + k$. Again, k can be
eliminated by translation if $q \neq 0$.
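The translations used above amount to completing the square in a variable that has a quadratic term and to absorbing the constant into a variable that appears only linearly. A short worked version, with the new coordinates $\bar{x}$, $\bar{z}$ introduced here only for illustration:

```latex
% Eliminating the linear term px when A is non-zero:
\[
Ax^2 + px = A\left(x + \frac{p}{2A}\right)^2 - \frac{p^2}{4A}
          = A\bar{x}^2 - \frac{p^2}{4A},
\qquad \bar{x} = x + \frac{p}{2A}.
\]
% The constant -p^2/(4A) is absorbed into k.

% Eliminating the constant k in type (ii) when r is non-zero:
\[
rz + k = r\left(z + \frac{k}{r}\right) = r\bar{z},
\qquad \bar{z} = z + \frac{k}{r}.
\]
```

Each substitution is a shift of the origin along one axis, so it changes the position of the graph but not its shape.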
This reduces our study to the following types of f:
(1) $Ax^2 + By^2 + Cz^2 + k$
(2) $Ax^2 + By^2 + rz$    (2') $Ax^2 + By^2 + k$
(3) $Ax^2 + qy$    (3') $Ax^2 + k$.
We begin with (1) in case $ABC \neq 0$ and $k \neq 0$. Then f = 0 can be written
in the form
\[ \pm\frac{x^2}{a^2} \pm \frac{y^2}{b^2} \pm \frac{z^2}{c^2} = 1. \]
If all three signs are minus, the graph is empty. Otherwise the graph is
symmetric in each coordinate plane because if (x, y, z) is on the graph, so are all
eight points $(\pm x, \pm y, \pm z)$. Therefore it suffices to draw the first octant
portion of the graph and determine the rest by symmetry.
Ellipsoids
Consider the graph of
\[ \frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1, \qquad a, b, c > 0. \]
Since squares are non-negative, each point of the graph satisfies
\[ \frac{x^2}{a^2} \leq 1, \quad \frac{y^2}{b^2} \leq 1, \quad \frac{z^2}{c^2} \leq 1. \]
This means the graph is confined to the box
\[ -a \leq x \leq a, \quad -b \leq y \leq b, \quad -c \leq z \leq c. \]
Suppose $-c < z_0 < c$. The intersection of the graph and the horizontal
plane $z = z_0$ consists of all points $(x, y, z_0)$ that satisfy
\[ \frac{x^2}{a^2} + \frac{y^2}{b^2} = 1 - \frac{z_0^2}{c^2}. \]
This curve is an ellipse. It is as large as possible when $z_0 = 0$, and it becomes
smaller and smaller as $z_0 \to c$ or $z_0 \to -c$. Thus each such
cross-section by a horizontal plane is an ellipse, except at the extremes $z_0 = \pm c$,
where it is a single point.
The same argument applies to plane sections parallel to the other
coordinate planes. This gives us enough information for a sketch. The surface
is called an ellipsoid (Fig. 6.1). In the special case a = b = c, it is a sphere.
Fig. 6.1 Graph of $x^2/a^2 + y^2/b^2 + z^2/c^2 = 1$: (a) horizontal cross-sections are ellipses; (b) the complete ellipsoid.
Hyperboloids of One Sheet
Consider the graph of
\[ \frac{x^2}{a^2} + \frac{y^2}{b^2} - \frac{z^2}{c^2} = 1, \qquad a, b, c > 0. \]
Each horizontal cross-section is an ellipse
\[ z = z_0, \qquad \frac{x^2}{a^2} + \frac{y^2}{b^2} = 1 + \frac{z_0^2}{c^2}, \]
no matter what value $z_0$ is. The smallest ellipse occurs for $z_0 = 0$; as $z_0 \to \infty$
or $z_0 \to -\infty$, the ellipses get larger and larger.
The surface meets the y, z-plane in the hyperbola
\[ \frac{y^2}{b^2} - \frac{z^2}{c^2} = 1 \]
and it meets the z, x-plane in the hyperbola
\[ \frac{x^2}{a^2} - \frac{z^2}{c^2} = 1. \]
This information is enough to sketch the surface, called a hyperboloid of one
sheet (Fig. 6.2a).
Fig. 6.2 Hyperboloids: (a) one sheet, $x^2/a^2 + y^2/b^2 - z^2/c^2 = 1$; (b) two sheets, $-x^2/a^2 - y^2/b^2 + z^2/c^2 = 1$.
Hyperboloids of Two Sheets
Consider the equation
\[ -\frac{x^2}{a^2} - \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1, \qquad a, b, c > 0. \]
If (x, y, z) is a point on the surface, then
\[ \frac{z^2}{c^2} = 1 + \frac{x^2}{a^2} + \frac{y^2}{b^2} \geq 1, \]
hence $z^2 \geq c^2$. This means either $z \geq c$ or $z \leq -c$, that is, there are no points
of the surface between the horizontal planes z = -c and z = c.
If $z_0^2 > c^2$, the horizontal plane $z = z_0$ meets the surface in the curve
\[ z = z_0, \qquad \frac{x^2}{a^2} + \frac{y^2}{b^2} = \frac{z_0^2}{c^2} - 1 > 0, \]
an ellipse. Also the surface meets the y, z-plane and the z, x-plane in the
hyperbolas
\[ -\frac{y^2}{b^2} + \frac{z^2}{c^2} = 1 \quad\text{and}\quad -\frac{x^2}{a^2} + \frac{z^2}{c^2} = 1 \]
respectively. The surface breaks into two parts, and it is called a hyperboloid
of two sheets (Fig. 6.2b).
Cones
Now we complete the study of type (1) on p. 254 for the case $ABC \neq 0$,
but k = 0. Then f = 0 can be written
\[ \pm\frac{x^2}{a^2} \pm \frac{y^2}{b^2} \pm \frac{z^2}{c^2} = 0, \qquad a, b, c > 0. \]
If the signs are all the same, then (0, 0, 0) is the only point on the graph; not
interesting. Otherwise two signs are equal, the other opposite. Changing signs
if necessary, we are reduced to
\[ z^2 = \frac{x^2}{a^2} + \frac{y^2}{b^2}, \qquad a, b > 0. \]
It is a surface with the following property. For each point $\mathbf{x}_0$ on the surface, the
entire line $\mathbf{x} = t\mathbf{x}_0$ lies on the surface. Such a surface is called a cone, and the
lines $\mathbf{x} = t\mathbf{x}_0$ are called generators of the cone.
To check that the graph of
\[ z^2 = \frac{x^2}{a^2} + \frac{y^2}{b^2} \]
is a cone, we take any point $\mathbf{x}_0$ on the graph and check that $t\mathbf{x}_0$ is also on the
graph. If $\mathbf{x}_0 = (x_0, y_0, z_0)$, then $t\mathbf{x}_0 = (tx_0, ty_0, tz_0)$ and
\[ (tz_0)^2 = t^2 z_0^2 = t^2\left(\frac{x_0^2}{a^2} + \frac{y_0^2}{b^2}\right) = \frac{(tx_0)^2}{a^2} + \frac{(ty_0)^2}{b^2}. \]
Thus $t\mathbf{x}_0$ is on the graph.
To sketch the cone, we note that it meets the horizontal plane z = 1 in
the ellipse
\[ \frac{x^2}{a^2} + \frac{y^2}{b^2} = 1. \]
For each point on this ellipse, we draw the line through the point and 0. See
Fig. 6.3.
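The discussion of type (1) can be summarized as a sign count. The following plain-Python sketch classifies $Ax^2 + By^2 + Cz^2 + k = 0$ when $ABC \neq 0$; the function name and the returned labels are assumptions chosen for this illustration.

```python
def classify_type_1(A, B, C, k):
    """Classify the graph of A x^2 + B y^2 + C z^2 + k = 0 with ABC != 0."""
    if A == 0 or B == 0 or C == 0:
        raise ValueError("type (1) requires ABC != 0")
    coeffs = (A, B, C)
    if k == 0:
        same_sign = all(v > 0 for v in coeffs) or all(v < 0 for v in coeffs)
        return "single point" if same_sign else "cone"
    # Rewrite as (-A/k) x^2 + (-B/k) y^2 + (-C/k) z^2 = 1
    # and count how many of the three signs are plus.
    plus = sum(1 for v in coeffs if -v / k > 0)
    return ["empty",
            "hyperboloid of two sheets",
            "hyperboloid of one sheet",
            "ellipsoid"][plus]

print(classify_type_1(1, 1, 1, -1))    # x^2 + y^2 + z^2 = 1
print(classify_type_1(1, 1, -1, -1))   # x^2 + y^2 - z^2 = 1
print(classify_type_1(-1, -1, 1, -1))  # -x^2 - y^2 + z^2 = 1
print(classify_type_1(1, 1, -1, 0))    # x^2 + y^2 = z^2
```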
Paraboloids
Next we take up case (2) on p. 254 and study the graph of f = 0, where
\[ f(x, y, z) = Ax^2 + By^2 + rz, \qquad AB \neq 0, \quad r \neq 0. \]
We may write the equation f = 0 in the form
\[ z = \pm\frac{x^2}{a^2} \pm \frac{y^2}{b^2}, \qquad a, b > 0. \]
If both signs are minus, replacing z by -z changes both signs to plus;
therefore we are reduced to two cases.
The first case is the surface
\[ z = \frac{x^2}{a^2} + \frac{y^2}{b^2}, \qquad a, b > 0, \]
called an elliptic paraboloid. It lies above the x, y-plane, and it is symmetric in
the y, z- and z, x-planes. Each horizontal cross-section
\[ \frac{x^2}{a^2} + \frac{y^2}{b^2} = z_0 > 0 \]
is an ellipse, and these ellipses grow larger as $z_0$ increases. The graph meets the
y, z- and z, x-planes in parabolas $z = y^2/b^2$ and $z = x^2/a^2$ respectively (Fig.
6.4a).
Fig. 6.4 Paraboloids: (a) elliptic paraboloid, $z = x^2/a^2 + y^2/b^2$; (b) hyperbolic paraboloid, $z = -x^2/a^2 + y^2/b^2$.
The second case is the hyperbolic paraboloid, the locus of
\[ z = -\frac{x^2}{a^2} + \frac{y^2}{b^2}, \qquad a, b > 0. \]
It is symmetric in the y, z- and z, x-planes. The horizontal planes $z = z_0 > 0$
meet it in hyperbolas whose branches open out in the y-direction. The
horizontal planes $z = z_0 < 0$ meet it in hyperbolas that open out in the x-direction.
The y, z-plane meets the locus in the parabola $z = y^2/b^2$, which opens upwards;
and the z, x-plane meets it in the parabola $z = -x^2/a^2$, which opens
downwards. The best description is "saddle-shaped." See Fig. 6.4b.
Cylinders
In cases (2') and (3) on p. 254, the variable z is missing. In general, when
one variable is missing the locus is a cylinder. Take for example $Ax^2 + By^2 +
k = 0$, where A > 0, B > 0, and k < 0. This can be written in the form
\[ \frac{x^2}{a^2} + \frac{y^2}{b^2} = 1, \qquad a, b > 0. \]
The surface meets each horizontal plane $z = z_0$ in the same ellipse. If $(x_0, y_0, z_0)$
is any point of the surface, the whole vertical line $(x_0, y_0, z)$ for $-\infty < z < \infty$
lies on the surface. The surface is an elliptic cylinder and these vertical lines
that lie on the surface are called generators of the cylinder (Fig. 6.5). Any
curve f(x, y) = 0 in the x, y-plane generates a cylinder in space consisting of
all points $(x_0, y_0, z)$ for which $f(x_0, y_0) = 0$. In particular, a circle generates a
(right) circular cylinder, a hyperbola generates a hyperbolic cylinder, etc.
Case (3) on p. 254 leads to a parabolic cylinder.
Fig. 6.5 Elliptic cylinder: $x^2/a^2 + y^2/b^2 = 1$.
In the final case (3') both y and z are missing. Depending on the signs of
A and k, the locus of $Ax^2 + k = 0$ is empty or consists of one plane or two
planes parallel to the y, z-plane. In general, the locus in $\mathbf{R}^3$ of f(x) = 0 is a
set of planes parallel to the y, z-plane. For each zero $x_0$ of f(x), the plane
$x = x_0$ is included in the locus.
EXERCISES
Sketch the first octant portion:
1. \ x2+ y2+ \z2= 1 2. z2+ \ y2+ iz2= 1
3. x2+ y2z2= 1 4. x2y2+ z2= 1
5. x2 y2+ z2= 1 6. x2+ y2 z2= 1
7. z = x2+ y2 8. z \ x2+ y2
9. z = x2+ y2 10. z = x2y2.
11. Identify the quadric surface $z = x^2 + 2x + y^2$. [Hint: Complete the square.]
12. Identify the quadric surface z = xy. [Hint: Rotate 45° around the z-axis.]
Sketch the paraboloids:
13. x = y2+ z2 14. y = x2 z2.
Sketch the surface in x, y, 2-space:
15. x z = 1 16. y = x2
17. xy = 1 18. - x 2+ y2= 1
19. x = 22 20. y2+ 422= 1
21. z2= x2+ y2 22. z2+ 422= 1
23. y2= z2+ 4z2 24. z2xy.
25. Suppose f(x, y) = 0, z = 1 is a curve on the plane z = 1. Find an equation for the
cone obtained by taking all points on all lines through 0 and points of the curve.
26. (cont.) Test your result on $x^2 + y^2 = 1$, z = 1.
27. Let f(y, z) = 0 be a curve in the y, z-plane. Find an equation for the surface of
revolution obtained by revolving the curve around the y-axis.
28. (cont.) Test your result on the curve $y^2 + 9z^2 = 1$.
7. INVERSES
The square matrix I with ones on the main diagonal and zeros everywhere
else is called the identity matrix. The 2 × 2 and 3 × 3 identity matrices are
\[ \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \quad\text{and}\quad \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}. \]
The matrix I behaves very much like the number 1. If $\mathbf{x}'$ is a column vector,
then $I\mathbf{x}' = \mathbf{x}'$; if A is a matrix (the same size as I) then IA = AI = A.
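Here is an important practical question. Given a 2 × 2 or 3 × 3 matrix A,
is there a matrix B (of the same size) such that AB = I and BA = I? Not
necessarily; for example, if
\[ A = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} \quad\text{and}\quad B = \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix}, \]
then
\[ AB = \begin{bmatrix} b_{11} & b_{12} \\ 0 & 0 \end{bmatrix}, \qquad BA = \begin{bmatrix} b_{11} & 0 \\ b_{21} & 0 \end{bmatrix}. \]
Therefore, both equations AB = I and BA = I are impossible for any choice
of B.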
Let A be a square matrix for which there exists a matrix B satisfying
AB = I and BA = I. Then we write $B = A^{-1}$ and call $A^{-1}$ the inverse
of A. Thus
\[ AA^{-1} = A^{-1}A = I. \]
Remark: It is reasonable to say the inverse. For if AB = BA = I, and
also AC = CA = I, then
\[ B = BI = B(AC) = (BA)C = IC = C. \]
Some matrices have inverses and some do not. How can we tell whether $A^{-1}$
exists? We shall derive a simple test using determinants.
Recall that to each square matrix A is associated a real number |A|,
the determinant of A. For instance |I| = 1, as is easily checked. One of the
most important properties of determinants concerns the determinant of a
product of matrices.
Theorem If A and B are square matrices of the same size, then
\[ |AB| = |A|\,|B|. \]
The proof is not easy; we shall omit it.
Suppose A has an inverse, B. Then AB = I, so by the theorem above
\[ |AB| = |I|, \qquad |A|\,|B| = 1. \]
Conclusion: $|A| \neq 0$. Thus for A to have an inverse, it is necessary that
$|A| \neq 0$. We are going to show that this condition is sufficient as well. It will
then follow that the matrices possessing inverses are precisely those whose
determinants are non-zero.
Remark: The situation is similar to that for reciprocals of real numbers.
The real number a has an inverse $a^{-1}$ such that $aa^{-1} = a^{-1}a = 1$ if and only if
$a \neq 0$.
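The product rule for determinants is not proved in the text, but it can at least be spot-checked. The following plain-Python sketch does so for 2 × 2 matrices; the matrices A and B are arbitrary illustrative choices, and S is the matrix shown earlier to have no inverse.

```python
def matmul(A, B):
    """Product of two square matrices of the same size (lists of rows)."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def det2(M):
    """Determinant of a 2x2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

A = [[2, 1], [5, 3]]
B = [[1, 4], [-2, 7]]
S = [[1, 0], [0, 0]]   # determinant 0, so no inverse can exist

print(det2(matmul(A, B)), det2(A) * det2(B))  # the two sides of |AB| = |A||B|
print(det2(S))
print(matmul(S, B))    # the second row is always zero, so SB is never I
```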
Suppose $|A| \neq 0$. We shall prove that $A^{-1}$ exists by actually computing it.
Cofactor Matrix
The cofactor matrix cof A of a 2 × 2 matrix A is defined by
\[ A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}, \qquad \operatorname{cof} A = \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}. \]
For a 3 × 3 matrix, cof A is defined by
\[ \operatorname{cof} A = \begin{bmatrix} m_{11} & -m_{21} & m_{31} \\ -m_{12} & m_{22} & -m_{32} \\ m_{13} & -m_{23} & m_{33} \end{bmatrix}. \]
Here $m_{ij}$ is the minor of $a_{ij}$ in A. Note that the signs alternate and that the
i, j-element of cof A is not $\pm m_{ij}$. Precisely, cof A = $[c_{ij}]$ where
\[ c_{ij} = (-1)^{i+j}\, m_{ji}. \]
The cofactor matrix cof A has an important relation to the original matrix
A. We shall prove the basic formula:
\[ A(\operatorname{cof} A) = (\operatorname{cof} A)A = |A|\,I. \]
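This is easy for a 2 × 2 matrix:
\[ A(\operatorname{cof} A) = \begin{bmatrix} a & b \\ c & d \end{bmatrix}\begin{bmatrix} d & -b \\ -c & a \end{bmatrix} = \begin{bmatrix} ad - bc & 0 \\ 0 & ad - bc \end{bmatrix} = |A|\,I, \]
and similarly (cof A)A = |A| I.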
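For a 3 × 3 matrix we have
\[ (\operatorname{cof} A)A = \begin{bmatrix} m_{11} & -m_{21} & m_{31} \\ -m_{12} & m_{22} & -m_{32} \\ m_{13} & -m_{23} & m_{33} \end{bmatrix}\begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}. \]
Call the product $[b_{ij}]$. A typical element on the principal diagonal is
\[ b_{11} = m_{11}a_{11} - m_{21}a_{21} + m_{31}a_{31}. \]
This is nothing else but the expansion of |A| by minors of the first column.
Therefore $b_{11} = |A|$.
A typical element off the diagonal is
\[ b_{12} = m_{11}a_{12} - m_{21}a_{22} + m_{31}a_{32}. \]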
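It is the expansion by minors (of the first column) of
\[ \begin{vmatrix} a_{12} & a_{12} & a_{13} \\ a_{22} & a_{22} & a_{23} \\ a_{32} & a_{32} & a_{33} \end{vmatrix}. \]
But this is the determinant of a matrix with two equal columns. Therefore
$b_{12} = 0$. We see that all elements on the principal diagonal of (cof A)A are |A|,
and all elements off the diagonal are 0. Therefore (cof A)A = |A| I. Using expansion by
minors of rows, we can prove A(cof A) = |A| I similarly.
Example:
\[ A = \begin{bmatrix} 3 & 0 & -1 \\ 2 & 1 & -2 \\ 4 & -3 & 1 \end{bmatrix}, \]
\[ \operatorname{cof} A = \begin{bmatrix}
+\begin{vmatrix} 1 & -2 \\ -3 & 1 \end{vmatrix} & -\begin{vmatrix} 0 & -1 \\ -3 & 1 \end{vmatrix} & +\begin{vmatrix} 0 & -1 \\ 1 & -2 \end{vmatrix} \\[2ex]
-\begin{vmatrix} 2 & -2 \\ 4 & 1 \end{vmatrix} & +\begin{vmatrix} 3 & -1 \\ 4 & 1 \end{vmatrix} & -\begin{vmatrix} 3 & -1 \\ 2 & -2 \end{vmatrix} \\[2ex]
+\begin{vmatrix} 2 & 1 \\ 4 & -3 \end{vmatrix} & -\begin{vmatrix} 3 & 0 \\ 4 & -3 \end{vmatrix} & +\begin{vmatrix} 3 & 0 \\ 2 & 1 \end{vmatrix}
\end{bmatrix}
= \begin{bmatrix} -5 & 3 & 1 \\ -10 & 7 & 4 \\ -10 & 9 & 3 \end{bmatrix}, \]
\[ A(\operatorname{cof} A) = -5I = (\operatorname{cof} A)A, \qquad |A| = -5. \]
Let us return to the question of inverses. Assume $|A| \neq 0$. From the rule
\[ A(\operatorname{cof} A) = (\operatorname{cof} A)A = |A|\,I \]
follows
\[ A\left(\frac{1}{|A|}\operatorname{cof} A\right) = \left(\frac{1}{|A|}\operatorname{cof} A\right)A = I. \]
A matrix A has an inverse if and only if $|A| \neq 0$. If $|A| \neq 0$, then
\[ A^{-1} = \frac{1}{|A|}\operatorname{cof} A. \]
Example 1:
\[ A = \begin{bmatrix} 5 & 3 \\ 3 & 1 \end{bmatrix}, \qquad |A| = -4 \neq 0, \qquad \operatorname{cof} A = \begin{bmatrix} 1 & -3 \\ -3 & 5 \end{bmatrix}, \qquad A^{-1} = -\frac{1}{4}\begin{bmatrix} 1 & -3 \\ -3 & 5 \end{bmatrix}. \]
Check:
\[ AA^{-1} = -\frac{1}{4}\begin{bmatrix} 5 & 3 \\ 3 & 1 \end{bmatrix}\begin{bmatrix} 1 & -3 \\ -3 & 5 \end{bmatrix} = -\frac{1}{4}\begin{bmatrix} -4 & 0 \\ 0 & -4 \end{bmatrix} = I. \]
Example 2:
\[ A = \begin{bmatrix} 1 & 0 & 1 \\ 2 & -1 & 1 \\ -2 & -2 & 1 \end{bmatrix}, \qquad |A| = -5 \neq 0, \]
\[ \operatorname{cof} A = \begin{bmatrix} 1 & -2 & 1 \\ -4 & 3 & 1 \\ -6 & 2 & -1 \end{bmatrix}, \qquad A^{-1} = -\frac{1}{5}\begin{bmatrix} 1 & -2 & 1 \\ -4 & 3 & 1 \\ -6 & 2 & -1 \end{bmatrix}. \]
Check:
\[ A^{-1}A = -\frac{1}{5}\begin{bmatrix} 1 & -2 & 1 \\ -4 & 3 & 1 \\ -6 & 2 & -1 \end{bmatrix}\begin{bmatrix} 1 & 0 & 1 \\ 2 & -1 & 1 \\ -2 & -2 & 1 \end{bmatrix} = -\frac{1}{5}\begin{bmatrix} -5 & 0 & 0 \\ 0 & -5 & 0 \\ 0 & 0 & -5 \end{bmatrix} = I. \]
Terminology: A square matrix that has an inverse is called non-singular.
A square matrix that does not have an inverse is called singular. Thus a
square matrix A is non-singular if and only if $|A| \neq 0$, and A is singular if and
only if |A| = 0.
Remark: Given A, if there is a B with AB = I, then $|A| \neq 0$ so A has an
inverse. This inverse must be B because
\[ A^{-1} = A^{-1}(I) = A^{-1}(AB) = (A^{-1}A)B = IB = B. \]
Similarly if CA = I, then A has an inverse and $A^{-1} = C$.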
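The formula $A^{-1} = \frac{1}{|A|}\operatorname{cof} A$ translates directly into a short program. The sketch below is a plain-Python illustration for 3 × 3 integer matrices using exact fractions; the function names are assumptions, and `cof` follows the text's convention that the i, j-element is $(-1)^{i+j} m_{ji}$.

```python
from fractions import Fraction

def minor(A, i, j):
    """Determinant of the 2x2 matrix left after deleting row i and column j."""
    rows = [r for r in range(3) if r != i]
    cols = [c for c in range(3) if c != j]
    return (A[rows[0]][cols[0]] * A[rows[1]][cols[1]]
            - A[rows[0]][cols[1]] * A[rows[1]][cols[0]])

def det3(A):
    """Expansion of |A| by minors of the first row."""
    return sum((-1) ** j * A[0][j] * minor(A, 0, j) for j in range(3))

def cof(A):
    """The text's cof A: entry (i, j) is (-1)^(i+j) times the minor of a_ji."""
    return [[(-1) ** (i + j) * minor(A, j, i) for j in range(3)]
            for i in range(3)]

def inverse(A):
    """Inverse by the cofactor formula; raises if A is singular."""
    d = det3(A)
    if d == 0:
        raise ValueError("singular matrix: no inverse exists")
    return [[Fraction(c, d) for c in row] for row in cof(A)]

A = [[3, 0, -1],
     [2, 1, -2],
     [4, -3, 1]]
print(det3(A))
print(cof(A))
print(inverse(A))
```

For matrices of this size the formula is perfectly practical; for larger matrices row reduction is the usual choice, a remark that goes beyond what the text covers here.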
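Linear Systems
A 3 × 3 linear system (three equations, three unknowns)
\[ \begin{cases} a_{11}x_1 + a_{12}x_2 + a_{13}x_3 = b_1 \\ a_{21}x_1 + a_{22}x_2 + a_{23}x_3 = b_2 \\ a_{31}x_1 + a_{32}x_2 + a_{33}x_3 = b_3 \end{cases} \]
can be written neatly in matrix form as $A\mathbf{x}' = \mathbf{b}'$, where
\[ A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}, \qquad \mathbf{x}' = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}, \qquad \mathbf{b}' = \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix}. \]
Suppose the matrix A is non-singular. Then we can multiply both sides of
the equation by $A^{-1}$:
\[ A^{-1}(A\mathbf{x}') = A^{-1}\mathbf{b}', \qquad I\mathbf{x}' = A^{-1}\mathbf{b}', \qquad \mathbf{x}' = A^{-1}\mathbf{b}'. \]
Therefore, if there is a solution, it can only be the vector $\mathbf{x}' = A^{-1}\mathbf{b}'$. But
this vector is in fact a solution. Check:
\[ A\mathbf{x}' = A(A^{-1}\mathbf{b}') = (AA^{-1})\mathbf{b}' = I\mathbf{b}' = \mathbf{b}'. \]
A linear system $A\mathbf{x}' = \mathbf{b}'$ with A non-singular has the unique solution
$\mathbf{x}' = A^{-1}\mathbf{b}'$.
Remark: Note the formal similarity between the linear system $A\mathbf{x}' = \mathbf{b}'$
and the elementary equation ax = b. The first is solved by multiplying both
sides by $A^{-1}$, the second by multiplying both sides by $1/a = a^{-1}$.
EXAMPLE 7.1
Solve
\[ \begin{bmatrix} 3 & 0 & -1 \\ 2 & 1 & -2 \\ 4 & -3 & 1 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 15 \\ 0 \\ 5 \end{bmatrix}. \]
Solution: Evaluate |A|, say by minors of the first row:
\[ |A| = 3(-5) + 0 + (-1)(-10) = -5. \]
Since $|A| \neq 0$, the matrix A is non-singular, and
\[ A^{-1} = \frac{1}{|A|}\operatorname{cof} A = -\frac{1}{5}\begin{bmatrix} -5 & 3 & 1 \\ -10 & 7 & 4 \\ -10 & 9 & 3 \end{bmatrix}. \]
Therefore the solution is
\[ \mathbf{x}' = A^{-1}\mathbf{b}' = -\frac{1}{5}\begin{bmatrix} -5 & 3 & 1 \\ -10 & 7 & 4 \\ -10 & 9 & 3 \end{bmatrix}\begin{bmatrix} 15 \\ 0 \\ 5 \end{bmatrix} = \begin{bmatrix} 14 \\ 26 \\ 27 \end{bmatrix}. \]
Answer: $x_1 = 14$, $x_2 = 26$, $x_3 = 27$.
The solution of $A\mathbf{x}' = \mathbf{b}'$ is
\[ \mathbf{x}' = A^{-1}\mathbf{b}' = \frac{1}{|A|}(\operatorname{cof} A)\mathbf{b}'. \]
From this formula follows Cramer's Rule for the solution of a linear system.
If $m_{ij}$ denotes the minor of $a_{ij}$ in A, then
\[ (\operatorname{cof} A)\mathbf{b}' = \begin{bmatrix} m_{11} & -m_{21} & m_{31} \\ -m_{12} & m_{22} & -m_{32} \\ m_{13} & -m_{23} & m_{33} \end{bmatrix}\begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix} = \begin{bmatrix} m_{11}b_1 - m_{21}b_2 + m_{31}b_3 \\ -m_{12}b_1 + m_{22}b_2 - m_{32}b_3 \\ m_{13}b_1 - m_{23}b_2 + m_{33}b_3 \end{bmatrix}. \]
Look carefully at the first element, $m_{11}b_1 - m_{21}b_2 + m_{31}b_3$. It is the value
of the determinant
\[ \begin{vmatrix} b_1 & a_{12} & a_{13} \\ b_2 & a_{22} & a_{23} \\ b_3 & a_{32} & a_{33} \end{vmatrix} \]
obtained as the expansion by minors of the first column. A similar statement
applies to the other elements. The result is a set of explicit formulas:
Let $A = (\mathbf{c}_1', \mathbf{c}_2', \mathbf{c}_3')$ be a 3 × 3 non-singular matrix.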
Then the solution of $A\mathbf{x}' = \mathbf{b}'$ is given by $[x_1, x_2, x_3]'$, where
\[ x_1 = \frac{D(\mathbf{b}', \mathbf{c}_2', \mathbf{c}_3')}{|A|}, \qquad x_2 = \frac{D(\mathbf{c}_1', \mathbf{b}', \mathbf{c}_3')}{|A|}, \qquad x_3 = \frac{D(\mathbf{c}_1', \mathbf{c}_2', \mathbf{b}')}{|A|}. \]
In other words, to obtain $x_i$, replace the i-th column of A by $\mathbf{b}'$, take the
determinant, then divide by |A|.
EXERCISES
Find $A^{-1}$ where A is:
\[ \text{11. } \begin{bmatrix} 1 & 1 & 1 \\ 1 & -1 & 2 \\ -1 & 1 & 4 \end{bmatrix} \qquad \text{12. } \begin{bmatrix} 2 & 0 & 0 \\ -1 & 3 & 0 \\ 1 & 2 & -1 \end{bmatrix} \]
\[ \text{13. } \begin{bmatrix} 2 & 2 & 6 \\ 1 & 2 & 9 \\ 1 & 3 & 2 \end{bmatrix} \qquad \text{14. } \begin{bmatrix} -1 & 2 & -9 \\ 1 & 3 & -2 \\ -2 & 2 & 6 \end{bmatrix} \]
\[ \text{15. } \begin{bmatrix} 4 & -3 & 1 \\ 2 & 1 & 2 \\ 3 & 0 & 1 \end{bmatrix} \qquad \text{16. } \begin{bmatrix} 1 & 1 & 3 \\ 2 & 4 & 7 \\ 0 & 0 & 9 \end{bmatrix} \]
Solve by Cramer's Rule:
1 1 - 1 3 "0 2 3 "0" 17. 1 - 1 1 x = 2 0 0 H 1 1 1 x = 1 _ - l 1 1- - 1- -0 3 5- -0- " 2 1 3 " " 1 I C O 1 1 < N 1 1 19. - 1 4 2 x = 0 20. ( N 1 r H i x = 1 - 3 1 1 - - - 1- . 3 - 1 - 1- - 1_
21. Let A and B have inverses. Prove that AB has an inverse and $(AB)^{-1} = B^{-1}A^{-1}$.
22. (cont.) Prove that cof(AB) = (cof B)(cof A) if A and B are non-singular.
23. Let A have an inverse. Prove that its transpose A' has an inverse and $(A')^{-1} = (A^{-1})'$.
24. Prove that a triangular matrix
\[ A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ 0 & a_{22} & a_{23} \\ 0 & 0 & a_{33} \end{bmatrix} \]
has an inverse if and only if $a_{11}a_{22}a_{33} \neq 0$. If this condition is satisfied, prove that $A^{-1}$ is also triangular.
The following exercises outline another method for inverting matrices. Set
\[ P_{12} = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \qquad D_3(c) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & c \end{bmatrix}, \qquad R_{31}(c) = \begin{bmatrix} 1 & 0 & c \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}. \]
25. Show that $P_{12}A$ is the result of interchanging row 1 and row 2 of A.
26. Show that $D_3(c)A$ is the result of multiplying row 3 of A by c.
27. Show that $R_{31}(c)A$ is the result of adding c times row 3 of A to row 1.
28. Write out $P_{23}$, $D_1(c)$, and $R_{12}(c)$.
29*. Let A be non-singular. Show there are matrices $Q_1, Q_2, \ldots, Q_n$ such that (1) $Q_nQ_{n-1}\cdots Q_2Q_1A = I$, (2) each $Q_i$ is a $P_{ij}$ or a $D_i(c)$ with $c \neq 0$ or an $R_{ij}(c)$.
30. (cont.)
Interpret this statement as follows: Apply a sequence of elementary row operations to A (interchange rows, multiply a row by a non-zero constant, add a multiple of one row to another) so as to change A to I. Apply the same sequence of operations to I. The result is $A^{-1}$.
Try the method of Ex. 30 on the A of:
31. Ex. 1    32. Ex. 2    33. Ex. 11    34. Ex. 12    35. Ex. 13    36. Ex. 16.
Let $A = A(t) = [a_{ij}(t)]$ be a matrix whose elements are differentiable functions. Define its derivative by $\dot{A} = [\dot{a}_{ij}(t)]$. Prove:
37. $(A + B)^{\cdot} = \dot{A} + \dot{B}$    38. $(fA)^{\cdot} = \dot{f}A + f\dot{A}$
39. $(AB)^{\cdot} = \dot{A}B + A\dot{B}$    40. $(A^{-1})^{\cdot} = -A^{-1}\dot{A}A^{-1}$.
8. CHARACTERISTIC ROOTS [optional]
Let A be a 2 × 2 matrix. A characteristic vector of A is a non-zero column vector $\mathbf{x}'$ such that $A\mathbf{x}'$ is a multiple of $\mathbf{x}'$:
\[ A\mathbf{x}' = \lambda\mathbf{x}', \]
where $\lambda$ is a scalar.
Given a vector $\mathbf{x}'$, the vector $A\mathbf{x}'$ generally does not resemble $\mathbf{x}'$ at all, whereas for a characteristic vector, $A\mathbf{x}'$ is simply a multiple of $\mathbf{x}'$. Therefore a characteristic vector is a rare animal, in fact, it is not obvious whether A has any characteristic vectors at all. One thing is clear though: if $\mathbf{x}'$ is a characteristic vector, then so is any non-zero multiple $\mathbf{y}' = c\mathbf{x}'$. For
\[ A\mathbf{y}' = A(c\mathbf{x}') = c(A\mathbf{x}') = c(\lambda\mathbf{x}') = \lambda(c\mathbf{x}') = \lambda\mathbf{y}'. \]
Let us investigate the existence of characteristic vectors. We write the condition $A\mathbf{x}' = \lambda\mathbf{x}'$ in the form
\[ (\lambda I - A)\mathbf{x}' = \mathbf{0}, \]
and ask whether there is a non-zero $\mathbf{x}'$ that satisfies this homogeneous system for some scalar $\lambda$. Set
\[ A = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \quad\text{and}\quad \mathbf{x}' = \begin{bmatrix} x \\ y \end{bmatrix}. \]
The question we ask is whether there is a scalar $\lambda$ for which the homogeneous linear system
\[ \begin{bmatrix} \lambda - a & -b \\ -c & \lambda - d \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \]
has a non-trivial solution. The answer is yes if and only if there is a $\lambda$ for which the determinant of the system is 0. Thus $\lambda$ must satisfy $f(\lambda) = 0$, where
\[ f(t) = |tI - A| = \begin{vmatrix} t - a & -b \\ -c & t - d \end{vmatrix} = (t - a)(t - d) - bc = t^2 - (a + d)t + (ad - bc). \]
The quadratic polynomial f(t) is called the characteristic polynomial of A, and its zeros are called the characteristic roots of A.
Conclusion: For each characteristic root $\lambda$ of A, there is a characteristic vector $\mathbf{x}'$ with $A\mathbf{x}' = \lambda\mathbf{x}'$; conversely, each characteristic vector corresponds in this way to a characteristic root.
Terminology: Several adjectives are often used where we have used "characteristic." The most common ones are "proper," "eigen" (from German), and "latent." Characteristic roots are also called characteristic values.
EXAMPLE 8.1
Find all characteristic roots and vectors of
\[ A = \begin{bmatrix} 2 & -1 \\ 4 & -3 \end{bmatrix}. \]
Solution:
\[ f(t) = |tI - A| = \begin{vmatrix} t - 2 & 1 \\ -4 & t + 3 \end{vmatrix} = (t - 2)(t + 3) + 4 = t^2 + t - 2 = (t - 1)(t + 2). \]
The characteristic roots are $\lambda = 1$ and $\mu = -2$. The corresponding homogeneous systems $(\lambda I - A)\mathbf{x}' = \mathbf{0}$ and $(\mu I - A)\mathbf{y}' = \mathbf{0}$ are
\[ \begin{bmatrix} -1 & 1 \\ -4 & 4 \end{bmatrix}\mathbf{x}' = \mathbf{0} \quad\text{and}\quad \begin{bmatrix} -4 & 1 \\ -4 & 1 \end{bmatrix}\mathbf{y}' = \mathbf{0}. \]
Obvious non-trivial solutions are
\[ \mathbf{x}' = \begin{bmatrix} 1 \\ 1 \end{bmatrix} \quad\text{and}\quad \mathbf{y}' = \begin{bmatrix} 1 \\ 4 \end{bmatrix}. \]
Any non-zero multiple is also a characteristic vector.
Answer: $\lambda = 1$, $\mathbf{x}' = a\begin{bmatrix} 1 \\ 1 \end{bmatrix}$; $\mu = -2$, $\mathbf{y}' = b\begin{bmatrix} 1 \\ 4 \end{bmatrix}$.
The characteristic polynomial f(t) of a 2 × 2 matrix A is quadratic. It has two complex roots which may or may not be real. Hence A will always have a characteristic vector if we admit complex scalars and vectors with complex entries. Otherwise, the discussion that follows applies only to matrices with real characteristic roots. (See Chapter 15 for complex numbers.)
Now $f(t) = t^2 - (a + d)t + (ad - bc)$, so the two characteristic roots are real provided
\[ \Delta = (a + d)^2 - 4(ad - bc) \geq 0. \]
The roots are distinct if $\Delta > 0$, equal if $\Delta = 0$.
Basic Structure Theorems
Suppose a certain problem involves a 2 × 2 matrix about which we know nothing. Now the general 2 × 2 matrix has four arbitrary elements, so is awkward to handle. The following results show that in many situations a general 2 × 2 matrix can be replaced by one of two simple types,
\[ \begin{bmatrix} \lambda & 0 \\ 0 & \mu \end{bmatrix} \quad\text{or}\quad \begin{bmatrix} \lambda & 1 \\ 0 & \lambda \end{bmatrix}. \]
There are many practical applications.
Theorem 8.1 Let A be a 2 × 2 matrix with distinct characteristic roots $\lambda$ and $\mu$.
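Then there is a non-singular matrix P such that
\[ P^{-1}AP = \begin{bmatrix} \lambda & 0 \\ 0 & \mu \end{bmatrix}. \]
Proof: Let $\mathbf{x}'$ and $\mathbf{y}'$ be corresponding characteristic vectors. Then $A\mathbf{x}' = \lambda\mathbf{x}'$, $A\mathbf{y}' = \mu\mathbf{y}'$, $\mathbf{x}' \neq \mathbf{0}$, and $\mathbf{y}' \neq \mathbf{0}$. Set $P = (\mathbf{x}', \mathbf{y}')$, considered as a 2 × 2 matrix. Then
\[ AP = A(\mathbf{x}', \mathbf{y}') = (A\mathbf{x}', A\mathbf{y}') = (\lambda\mathbf{x}', \mu\mathbf{y}') = (\mathbf{x}', \mathbf{y}')\begin{bmatrix} \lambda & 0 \\ 0 & \mu \end{bmatrix} = P\begin{bmatrix} \lambda & 0 \\ 0 & \mu \end{bmatrix}. \]
If P is non-singular, then we multiply the last equation by $P^{-1}$:
\[ P^{-1}AP = P^{-1}(AP) = P^{-1}\left(P\begin{bmatrix} \lambda & 0 \\ 0 & \mu \end{bmatrix}\right) = (P^{-1}P)\begin{bmatrix} \lambda & 0 \\ 0 & \mu \end{bmatrix} = I\begin{bmatrix} \lambda & 0 \\ 0 & \mu \end{bmatrix} = \begin{bmatrix} \lambda & 0 \\ 0 & \mu \end{bmatrix}. \]
It remains to prove that P is non-singular. If not, then the homogeneous system
\[ P\begin{bmatrix} a \\ b \end{bmatrix} = \mathbf{0} \]
has a non-trivial solution. That is, there exist a and b, not both zero, such that
\[ a\mathbf{x}' + b\mathbf{y}' = \mathbf{0}. \]
Multiply by A:
\[ aA\mathbf{x}' + bA\mathbf{y}' = \mathbf{0}, \qquad a\lambda\mathbf{x}' + b\mu\mathbf{y}' = \mathbf{0}. \]
Now eliminate $\mathbf{x}'$ from $a\mathbf{x}' + b\mathbf{y}' = \mathbf{0}$ and $a\lambda\mathbf{x}' + b\mu\mathbf{y}' = \mathbf{0}$:
\[ b(\lambda - \mu)\mathbf{y}' = \mathbf{0}. \]
Since $\lambda \neq \mu$ and $\mathbf{y}' \neq \mathbf{0}$, we have b = 0. Similarly a = 0, a contradiction.
EXAMPLE 8.2
Apply the theorem to the matrix of Example 8.1.
Solution: The matrix is
\[ A = \begin{bmatrix} 2 & -1 \\ 4 & -3 \end{bmatrix} \]
and we found the characteristic vectors
\[ \mathbf{x}' = \begin{bmatrix} 1 \\ 1 \end{bmatrix} \quad\text{and}\quad \mathbf{y}' = \begin{bmatrix} 1 \\ 4 \end{bmatrix} \]
corresponding to $\lambda = 1$ and $\mu = -2$. Set
\[ P = (\mathbf{x}', \mathbf{y}') = \begin{bmatrix} 1 & 1 \\ 1 & 4 \end{bmatrix}. \]
Then $|P| = 3 \neq 0$ and
\[ AP = \begin{bmatrix} 2 & -1 \\ 4 & -3 \end{bmatrix}\begin{bmatrix} 1 & 1 \\ 1 & 4 \end{bmatrix} = \begin{bmatrix} 1 & -2 \\ 1 & -8 \end{bmatrix}, \qquad P\begin{bmatrix} 1 & 0 \\ 0 & -2 \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 1 & 4 \end{bmatrix}\begin{bmatrix} 1 & 0 \\ 0 & -2 \end{bmatrix} = \begin{bmatrix} 1 & -2 \\ 1 & -8 \end{bmatrix}, \]
\[ AP = P\begin{bmatrix} 1 & 0 \\ 0 & -2 \end{bmatrix}, \qquad P^{-1}AP = \begin{bmatrix} 1 & 0 \\ 0 & -2 \end{bmatrix}. \]
Theorem 8.2 Let A be a 2 × 2 matrix with equal characteristic roots $\lambda$, $\lambda$. Then there is a non-singular matrix P such that either
\[ P^{-1}AP = \begin{bmatrix} \lambda & 0 \\ 0 & \lambda \end{bmatrix} \quad\text{or}\quad P^{-1}AP = \begin{bmatrix} \lambda & 1 \\ 0 & \lambda \end{bmatrix}. \]
Proof: Let $\mathbf{x}'$ be a characteristic vector, so $\mathbf{x}' \neq \mathbf{0}$ and $A\mathbf{x}' = \lambda\mathbf{x}'$. Choose any vector $\mathbf{y}'$ so that $P = (\mathbf{x}', \mathbf{y}')$ is non-singular. Then $\mathbf{x}'$ and $\mathbf{y}'$ are non-collinear with $\mathbf{0}$, so the vector $A\mathbf{y}'$ can be expressed as
\[ A\mathbf{y}' = a\mathbf{x}' + b\mathbf{y}'. \]
In fact, if we set
\[ P^{-1}A\mathbf{y}' = \begin{bmatrix} a \\ b \end{bmatrix}, \]
then
\[ A\mathbf{y}' = (PP^{-1})(A\mathbf{y}') = P(P^{-1}A\mathbf{y}') = P\begin{bmatrix} a \\ b \end{bmatrix} = a\mathbf{x}' + b\mathbf{y}'. \]
Therefore
\[ AP = A(\mathbf{x}', \mathbf{y}') = (A\mathbf{x}', A\mathbf{y}') = (\lambda\mathbf{x}', a\mathbf{x}' + b\mathbf{y}') = (\mathbf{x}', \mathbf{y}')\begin{bmatrix} \lambda & a \\ 0 & b \end{bmatrix} = P\begin{bmatrix} \lambda & a \\ 0 & b \end{bmatrix}, \]
hence
\[ P^{-1}AP = \begin{bmatrix} \lambda & a \\ 0 & b \end{bmatrix}. \]
Now an important observation: $P^{-1}AP$ and A have the same characteristic polynomial.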
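For
\[ |tI - P^{-1}AP| = |tP^{-1}P - P^{-1}AP| = |P^{-1}(tI - A)P| = |P^{-1}|\,|tI - A|\,|P| = |tI - A|\,|P^{-1}|\,|P| = |tI - A|\,|P^{-1}P| = |tI - A|. \]
Let f(t) denote this common characteristic polynomial. On the one hand, the zeros of f(t) are $\lambda$, $\lambda$ by hypothesis. On the other hand,
\[ f(t) = |tI - P^{-1}AP| = \begin{vmatrix} t - \lambda & -a \\ 0 & t - b \end{vmatrix} = (t - \lambda)(t - b), \]
so the zeros are $\lambda$, b. Hence $\lambda = b$. Therefore
\[ A\mathbf{x}' = \lambda\mathbf{x}', \quad A\mathbf{y}' = a\mathbf{x}' + \lambda\mathbf{y}', \quad\text{and}\quad P^{-1}AP = \begin{bmatrix} \lambda & a \\ 0 & \lambda \end{bmatrix}. \]
If a = 0, we are done. If $a \neq 0$, we simply repeat the argument with $\mathbf{x}'$ replaced by $a\mathbf{x}'$. Then
\[ A\mathbf{x}' = \lambda\mathbf{x}', \qquad A\mathbf{y}' = \mathbf{x}' + \lambda\mathbf{y}' \]
so, with P modified accordingly,
\[ P^{-1}AP = \begin{bmatrix} \lambda & 1 \\ 0 & \lambda \end{bmatrix}. \]
This completes the proof.
EXAMPLE 8.3
Apply the theorem to
\[ A = \begin{bmatrix} -1 & 0 \\ 3 & -1 \end{bmatrix}. \]
Solution: The characteristic polynomial is $f(t) = (t + 1)^2$ so the characteristic roots are -1, -1. Let $\mathbf{x}'$ be a characteristic vector. Then
\[ \lambda I - A = -I - A = \begin{bmatrix} 0 & 0 \\ -3 & 0 \end{bmatrix} \]
so we must have
\[ \begin{bmatrix} 0 & 0 \\ -3 & 0 \end{bmatrix}\mathbf{x}' = \mathbf{0}. \]
An obvious solution is
\[ \mathbf{x}' = \begin{bmatrix} 0 \\ 1 \end{bmatrix}. \]
The easiest choice of $\mathbf{y}'$ is
\[ \mathbf{y}' = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \quad\text{so}\quad P = (\mathbf{x}', \mathbf{y}') = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}. \]
Then
\[ A\mathbf{x}' = -\mathbf{x}' \quad\text{and}\quad A\mathbf{y}' = \begin{bmatrix} -1 \\ 3 \end{bmatrix} = 3\mathbf{x}' - \mathbf{y}'. \]
Therefore
\[ AP = A(\mathbf{x}', \mathbf{y}') = (A\mathbf{x}', A\mathbf{y}') = (-\mathbf{x}', 3\mathbf{x}' - \mathbf{y}') = (\mathbf{x}', \mathbf{y}')\begin{bmatrix} -1 & 3 \\ 0 & -1 \end{bmatrix} = P\begin{bmatrix} -1 & 3 \\ 0 & -1 \end{bmatrix}. \]
We replace $\mathbf{x}'$ by $3\mathbf{x}'$, that is, we now take
\[ \mathbf{x}' = \begin{bmatrix} 0 \\ 3 \end{bmatrix}, \qquad \mathbf{y}' = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \qquad P = \begin{bmatrix} 0 & 1 \\ 3 & 0 \end{bmatrix}. \]
After these changes, $A\mathbf{x}' = -\mathbf{x}'$ and $A\mathbf{y}' = \mathbf{x}' - \mathbf{y}'$, so
\[ AP = P\begin{bmatrix} -1 & 1 \\ 0 & -1 \end{bmatrix}, \qquad P^{-1}AP = \begin{bmatrix} -1 & 1 \\ 0 & -1 \end{bmatrix}. \]
We shall apply these results in Chapter 15, Section 6 to systems of differential equations.
3 × 3 Matrices
For a 3 × 3 matrix A, characteristic vectors and roots are defined just as for 2 × 2 matrices. A characteristic vector of A is a non-zero column vector $\mathbf{x}'$ such that $A\mathbf{x}' = \lambda\mathbf{x}'$ for some scalar $\lambda$. Searching for characteristic vectors leads to the characteristic polynomial $f(t) = |tI - A|$, whose zeros are the characteristic roots of A. Now
\[ f(t) = \begin{vmatrix} t - a_{11} & -a_{12} & -a_{13} \\ -a_{21} & t - a_{22} & -a_{23} \\ -a_{31} & -a_{32} & t - a_{33} \end{vmatrix} = t^3 + bt^2 + ct + d, \]
a cubic polynomial with real coefficients.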
Since / has odd degree, there must be at least one real characteristic root. In general, the cubic has either one real and two complex zeros or three real zeros. Theorem 8,3 Each 3 X 3 matrix has a real characteristic root. Corresponding to the structure theorems above is a more complicated result which we state without proof. Theorem 8.4 Let A be a 3 X 3 matrix, and let X, n, v be the characteristic roots of A ynot necessarily all real or distinct. Then there is a non-singular matrix P such that P~lAP has one of the following forms: X 0 0 X 1 0 X 1 o' 0 n 0 J 0 X 0 J 0 X 1 oo_0 0 n_ S < o O R e m a r k : There is only one possible basic form if all three characteristic roots are distinct, but three basic forms if the three characteristic roots are equal, and two basic forms if two roots are equal and the third distinct. Symmetric Matrices All characteristic roots of a symmetric matrix are real. That is, if A is an n X n symmetric matrix (with real coefficients), then the characteristic polynomial/ ( 0 factors completely, f i t ) = (t - Xi )(* - X2) - - - ( * - Xn), where Xi, , Xn are real. The proof of this basic fact, while not hard, belongs to a course in linear 8. Characteri st i c Roots 277 algebra. The 2 X 2 case is easy. For if A = then f i t ) = t a b a b b c = (t a) (t c) b2 b t c = t2 (a + c)t + (ac 62), and the discriminant is A = (a + c)2 - 4 (ac - b2) = (a - c)2 + 4b2 > 0. Therefore the roots are real. We shall prove the 3 X 3 case in Chapter 9, Section 8, but the analytic proof there does not generalize to 4 X 4 or higher cases. We state the result formally. Theorem 8.5 All characteristic roots o f a 2 X2 o r 3 X3 symmetric matrix are real. The basic structure theorems take a particularly simple form for symmetric matrices. The possibilities A 1 0 x simply never occur! The following theorem summarizes the situation. 
"x 1 0 "A 1 0" 0 X 0 > 0 A 1 _0 0 _0 0 A_ Theorem 8.6 If A is a 2 X 2 symmetric matrix, then there is a non-singu lar matrix P such that A 0" P~lAP = 0 /i I f Ai sa3 X 3 symmetric matrix, then there is a non-singular matrix P such that "A 0 0" P~lAP = 0 /x 0 0 0 P 278 7. LINEAR FUNCTIONS AND MATRICES The proof for 3 X 3 matrices is beyond the scope of this course. See Ex. 35 for the 2 X 2 case. Now we come to an important relation between characteristic roots and definiteness. Theorem 8.7 Let A be a symmetric matrix with characteristic roots Xi, X2, . Then (1) A is positive (negative) definite if and only if all X* > 0 (all X; < 0); (2 ) A is positive (negative) semi-definite if and only if all Xt > 0 (all Xi < 0 ); (3) A is indefinite if and only if some Xt > 0 and some Xy < 0. We shall prove only a small part of this result because the complete proof, especially for 3 X 3 and higher, is hard. We shall show that if A is positive definite, then all Xi > 0. The negative definite case is similar. Thus let A be positive definite and let Xbe any characteristic root. There is an x V 0 such that Ax' = Xx'. Therefore xAx' = Xxx'. But xAx' > 0 because A is positive definite and x' ^ 0; also xx' > 0 because x' 0. Therefore X> 0. EXERCISES Find all characteristic roots and vectors: 2 1. 3. 5. 7. 9. 11. j a 0 '1 2" 4 3 -3 1 -1 - 1 4. 6. 8. 10. 12. C O i o r o1_ "1 1 1 o1_ 1 O' 1 1 1 - 1 - 2 2 T " 1 - 1. - 5 - r r 1 4 2. 4 91 _1 J 8. Characteri st i c Roots 279 [-1 1 ] C -I] - n r a ca- 17. Suppose be > 0. Prove that A has real characteristic roots. 18. (cont.) Under the same hypothesis, find when the roots coincide. 19. Prove X+ \x = a + d and X/x= ad be. 20. Prove A', the transpose, has the same roots as A. Find P so that P~lAP H ; a ca n 23. A Find P so that P~lAP -C 3 ' - [c.a - - ua 1r- 1 0 1 1 r 4 91 25. 26. 1L i - 1 J 1Li 2j Find all characteristic roots and vectors: "3 0 0 2 1 0 27. 0 1 0 28. 0 2 0 _0 0 2_ -0 0 1_ - 2 0 0" "3 1 0 29. 1 - 2 0 30. 0 3 1 . 
0 1 2_ -0 0 3_ 4 0 9 3 0 0 31. 0 0 0 32. 0 1 2 _1 0 2. -0 1 2. 33*. Let A c < n- -ca have characteristic roots X< /x. Prove that X< a < n and X< 280 7. LINEAR FUNCTIONS AND MATRICES 34*. Let A be a symmetric matrix and x' a characteristic vector. Let y' x' = 0. Prove that (Ay') x' = 0. 35*. (cont.) Prove Theorem 8.6 for 2 X 2 symmetric matrices. 36. Let A be a 3 X 3 matrix with characteristic roots X, /z, v. Prove that |A| = \nv. [Hint: Set t = 0 in the identity |tl A\ = (t X) (t n) (t y).] 37*. (cont.) Prove X+ j u+ v = an + a22 + 33- 38*. Let A be a 3 X 3 symmetric matrix. Prove that A = 0 if and only if 0 is the only characteristic root of A. 39. (cont.) Show this is false for some non-symmetric A. 40. Let Xbe a characteristic root of A. Prove that X2 is a characteristic root of A2. 8. Several Vari abl e Differential Calculus 1. DIFFERENTIABLE FUNCTIONS It may be said that differential calculus is the study of functions which have a linear approximation at each point of their domains. We shall extend this point of view to functions of several variables, but first let us review the one variable situation. Suppose/(#) has derivative/'(c) at x = c. Then for x near c, the approxi mation f ( x) ~ f { c ) + f ' {c) {x c) is quite accurate. The term accurate is not well defined; let us express what we mean geometrically. We know that V = /(c) + f {c)(x c) is the equation of the tangent line to the graph of y = f i x ) (c,/(c)). See Fig. 1.1. Geometrically, the tangent hugs the curve near the point of tangency. Therefore if e(x) is the vertical distance between the curve and the tangent line, e (x) ought to be small compared to \x c|. In fact, the closer x is to c, the smaller the ratio e (x)/\x c\ should be. Let us now state these ideas more precisely. Suppose / is differentiable at a point c of an open interval D. Set f i x ) = /(c) +f ' { c ) ( x - c) + e(x). Then e(x) 1 -------- >0 as x -------->c. \x c| This statement characterizes the number /'(c). 
It suggests that the derivative could have been defined by an approximation property:
Suppose there exists a number $k$ such that
$$f(x) = f(c) + k(x - c) + e(x), \quad\text{where}\quad \frac{e(x)}{|x - c|} \to 0 \ \text{ as } x \to c.$$
Then $f$ is differentiable at $c$ and $k = f'(c)$.
This definition of derivative implies the usual one. For if the condition is satisfied, then
$$\frac{f(x) - f(c)}{x - c} = k + \frac{e(x)}{x - c} \to k + 0 \quad\text{as } x \to c.$$
Hence $f'(c) = k$.
Let us change our notation slightly. We shall always assume that $c$ is an interior point of a domain $D$. Given such a $c$, there is a small interval $(c - \delta, c + \delta)$ contained in $D$. Hence we can denote points in $D$ near $c$ by $c + x$ where $|x| < \delta$. Using this notation and writing $e(x)$ instead of $e(x + c)$, we may restate the basic facts about differentiability as follows:
Suppose $f$ is differentiable at a point $c$ of an open interval $D$. Set
$$f(c + x) = f(c) + f'(c)x + e(x).$$
Then $e(x)/|x| \to 0$ as $x \to 0$.
Conversely, suppose there exist a number $k$ and $\delta > 0$ such that
$$f(c + x) = f(c) + kx + e(x) \quad\text{for } |x| < \delta,$$
where $e(x)/|x| \to 0$. Then $f$ is differentiable at $c$ and $k = f'(c)$.
For a function of one variable, the relation is clear: if the derivative exists, the linear approximation exists; if the linear approximation exists, the derivative exists. For a function of several variables, the situation is not at all like this. It can happen that $\partial f/\partial x$ and $\partial f/\partial y$ both exist at a point $(a, b)$, but no linear approximation $c + hx + ky$ exists!
EXAMPLE 1.1
Define $f$ by
$$f(0, 0) = 0, \qquad f(x, y) = \frac{2xy}{x^2 + y^2}, \quad (x, y) \ne (0, 0).$$
Prove that $\partial f/\partial x$ and $\partial f/\partial y$ both exist at $(0, 0)$, but $f$ does not have a linear approximation at $(0, 0)$; indeed, $f$ is not even continuous at $(0, 0)$.
Solution: Clearly $f(x, 0) = 0$ for all $x$. Hence
$$\frac{\partial f}{\partial x}(0, 0) = \frac{d}{dx} f(x, 0)\Big|_{x=0} = 0.$$
Similarly $\partial f/\partial y = 0$ at $(0, 0)$, so both partials exist and are zero.
Now $f(x, y) = 0$ at each point of the $x$- and $y$-axes, so the only conceivable linear approximation is $0 + 0x + 0y = 0$. But for $x = y \ne 0$,
$$f(x, x) = \frac{2x^2}{x^2 + x^2} = 1.$$
Hence $f(x, y) = 1$ at all points of the line $y = x$ except $(0, 0)$, so $f$ is not continuous at $(0, 0)$ and certainly 0 does not begin to approximate $f$.
Remark: It is also possible to construct an example of a continuous function whose first partials exist, but which fails to have a linear approximation. See Exs. 1-3.
Differentiable Functions
As we have just seen, the mere existence of both partials of $f(x, y)$ may not guarantee that $f(x, y)$ is a reasonably behaved function. We shall therefore limit our attention to functions that have good linear approximations at each point.
Recall that each linear function $L: \mathbf{R}^2 \to \mathbf{R}$ is given by $L(\mathbf{x}) = \mathbf{k}\cdot\mathbf{x}$ for some fixed vector $\mathbf{k}$. Let us now define differentiability for functions of two variables, that is, functions $f: D \to \mathbf{R}$, where $D$ is a subset of $\mathbf{R}^2$. The discussion will extend easily to functions of three or more variables.
Differentiable Function
Let $f$ be defined on a domain $D$ of the plane $\mathbf{R}^2$ and let $\mathbf{c} \in D$. Then $f$ is differentiable at $\mathbf{c}$ if there exist a linear function $\mathbf{k}\cdot\mathbf{x}$ and a function $e(\mathbf{x})$ such that
(i) $f(\mathbf{c} + \mathbf{x}) = f(\mathbf{c}) + \mathbf{k}\cdot\mathbf{x} + e(\mathbf{x})$ whenever $\mathbf{c} + \mathbf{x} \in D$,
(ii) $e(\mathbf{x})/|\mathbf{x}| \to 0$ as $\mathbf{x} \to \mathbf{0}$.
The definition is completely analogous to the one for functions of a single variable. Just compare the relations
$$f(c + x) = f(c) + kx + e(x), \qquad e(x)/|x| \to 0,$$
and
$$f(\mathbf{c} + \mathbf{x}) = f(\mathbf{c}) + \mathbf{k}\cdot\mathbf{x} + e(\mathbf{x}), \qquad e(\mathbf{x})/|\mathbf{x}| \to 0.$$
It is instructive to write out the definition in coordinate form:
A function $f$ is differentiable at $(a, b)$ if there exists a linear function $hx + ky$ such that
$$f(a + x, b + y) = f(a, b) + hx + ky + e(x, y),$$
where
$$\frac{e(x, y)}{\sqrt{x^2 + y^2}} \to 0 \quad\text{as } (x, y) \to (0, 0).$$
An immediate consequence of differentiability is continuity.
Theorem 1.1 If $f$ is differentiable at $(a, b)$, then $f$ is continuous at $(a, b)$.
Proof: Let
$$f(a + x, b + y) = f(a, b) + hx + ky + e(x, y)$$
as in the definition. If $(x, y) \to (0, 0)$, then $x \to 0$, $y \to 0$, and $e(x, y) \to 0$. Consequently $f(a + x, b + y) \to f(a, b)$. This is continuity.
The mere existence of partial derivatives at $(a, b)$ does not guarantee differentiability, as we noted in Example 1.1. However, differentiability does guarantee the existence of partial derivatives.
Theorem 1.2 Let $f$ be differentiable at an interior point $(a, b)$ of its domain, and let
$$f(a + x, b + y) = f(a, b) + hx + ky + e(x, y),$$
where $e(x, y)/|(x, y)| \to 0$ as $(x, y) \to (0, 0)$. Then the first partials of $f$ exist at $(a, b)$ and
$$\frac{\partial f}{\partial x}(a, b) = h, \qquad \frac{\partial f}{\partial y}(a, b) = k.$$
Proof:
$$\frac{\partial f}{\partial x}(a, b) = \lim_{x \to 0} \frac{f(a + x, b) - f(a, b)}{x} = \lim_{x \to 0} \frac{hx + e(x, 0)}{x} = h + \lim_{x \to 0} \frac{e(x, 0)}{x} = h.$$
Similarly $f_y(a, b) = k$.
Again there is a strong analogy between the preceding formula and the corresponding one for functions of one variable. Just compare the relations
$$f(c + x) = f(c) + \frac{df}{dx}\Big|_{c}\, x + e(x)$$
and
$$f(a + x, b + y) = f(a, b) + \frac{\partial f}{\partial x}\, x + \frac{\partial f}{\partial y}\, y + e(x, y).$$
A Test for Differentiability
The next proof uses the Mean Value Theorem for functions of one variable. Let us review its content:
Mean Value Theorem Suppose $f$ is continuous on a closed interval $a \le x \le b$ and differentiable on the open interior $a < x < b$. Then there exists $c$ such that $a < c < b$ and
$$f(b) - f(a) = f'(c)(b - a).$$
We are prepared for a practical test for the differentiability of a function.
Theorem 1.3 Suppose $f$ has domain $D$ and $(a, b)$ is an interior point of $D$. Suppose (1) $\partial f/\partial x$ and $\partial f/\partial y$ exist at each interior point of $D$ and (2) these partials are continuous at $(a, b)$. Then $f$ is differentiable at $(a, b)$.
Proof: Write
$$f(a + x, b + y) - f(a, b) = [f(a + x, b + y) - f(a, b + y)] + [f(a, b + y) - f(a, b)].$$
By the Mean Value Theorem, applied twice,
$$f(a + x, b + y) - f(a, b + y) = x f_x(a + \theta x, b + y),$$
$$f(a, b + y) - f(a, b) = y f_y(a, b + \lambda y),$$
where $0 < \theta < 1$ and $0 < \lambda < 1$. (But $\theta$ and $\lambda$ depend on $x$ and $y$.) Set
(1) $f(a + x, b + y) = f(a, b) + x f_x(a, b) + y f_y(a, b) + e(x, y)$,
so that
(2) $e(x, y) = x\,g(x, y) + y\,h(x, y)$,
where
(3) $g(x, y) = f_x(a + \theta x, b + y) - f_x(a, b)$, $\quad h(x, y) = f_y(a, b + \lambda y) - f_y(a, b)$.
Relation (1) implies that $f$ is differentiable at $(a, b)$, provided we prove that
$$\frac{e(x, y)}{\sqrt{x^2 + y^2}} \to 0 \quad\text{as } (x, y) \to (0, 0).$$
We obviously have $(a + \theta x, b + y) \to (a, b)$ and $(a, b + \lambda y) \to (a, b)$ as $(x, y) \to (0, 0)$. This implies $g(x, y) \to 0$ and $h(x, y) \to 0$ as $(x, y) \to (0, 0)$ by (3) and the assumption that $f_x$ and $f_y$ are continuous at $(a, b)$. Now divide both sides of (2) by $\sqrt{x^2 + y^2}$, take absolute values, and apply the triangle inequality:
$$\left|\frac{e(x, y)}{\sqrt{x^2 + y^2}}\right| \le |g(x, y)|\,\frac{|x|}{\sqrt{x^2 + y^2}} + |h(x, y)|\,\frac{|y|}{\sqrt{x^2 + y^2}} \le |g(x, y)| + |h(x, y)|.$$
Clearly $|g(x, y)| + |h(x, y)| \to 0$ as $(x, y) \to (0, 0)$. This completes the proof.
Three Variables
For three or more variables, the definitions and theorems are easy extensions of those for two variables. We shall merely state the definition of differentiability for three variables and leave the extensions of Theorems 1.1-1.3 as exercises.
Differentiable Function
Let $f$ be defined on a domain $D$ of three-space $\mathbf{R}^3$, and let $\mathbf{c} \in D$. Then $f$ is called differentiable at $\mathbf{c}$ if there exist a linear function $\mathbf{k}\cdot\mathbf{x}$ and a function $e(\mathbf{x})$ such that
(i) $f(\mathbf{c} + \mathbf{x}) = f(\mathbf{c}) + \mathbf{k}\cdot\mathbf{x} + e(\mathbf{x})$ whenever $\mathbf{c} + \mathbf{x} \in D$,
(ii) $e(\mathbf{x})/|\mathbf{x}| \to 0$ as $\mathbf{x} \to \mathbf{0}$.
The function $f$ is called differentiable on $D$ provided it is differentiable at each point of $D$.
Chain Rul e 287 Exactly as for two variables, it follows from this definition that k=(d l d l tf\| \ dx dy 9dz) Ix=c The proof is left as an exercise. EXERCISES Show that f (x, y) is differentiable at (a, b) by finding a linear function hx + ky that satisfies the definition of differentiability: 1- f(x, y) = xy2at (0, 0 ) 2*. f i x, y) = 1 /xy at (1, 2 ). 3. Define/on R2by/(0, 0) = 0 and f(x, y ) = for (X, y) ^ (0, 0 ). xL-+- y Prove / is continuous on R2. 4*. (cont.) Prove that df/dx and df/dy exist on R2and are continuous except at (0, 0). 5. (cont.) Prove that/is not differentiable at (0, 0). 6. Prove that the sum of two differentiable functions is differentiable. 7. Prove that the product of differentiable functions is differentiable. 8. Let g be differentiable at (a, b) and g(a,b) j* 0. Prove that l/g is differentiable at (a, 6). 9. (cont.) Prove that the quotient of differentiable functions is differentiable when ever the denominator is not zero. State and prove for three variables: 10. Theorem 1.1 11. Theorem 1.2. 2. CHAIN RULE The Chain Rule for functions of one variable gives the derivative of a composite function: if y = f i x ) where x = x(t), then dy dy dx dt dx dt The Chain Rule for functions of several variables really has two parts, a theoretical part, which says that a composite function built out of differentiable functions is itself differentiable, and a practical part, which is a formula for computing partial derivatives of such a composite function. We shall postpone a precise statement and proof until Section 9. First we shall work intuitively in order to gain a feeling for how the Chain Rule is used in practice. Suppose z = f (x, y) where x = x{t) and y = y(t). Thus z is indirectly a 288 8. SEVERAL VARIABLE DIFFERENTIAL CALCULUS function of t. The Chain Rule asserts that dz dz dx dz dy dt dx dt dy dt This Chain Rule can be better understood in terms of vectors. 
The composite function $z(t) = f[x(t), y(t)]$ can be thought of as $z(t) = f[\mathbf{x}(t)]$. If $t$ is time, then $\mathbf{x}(t)$ represents the path of a moving particle in the plane, and the composite function $f[\mathbf{x}(t)]$ assigns a number $z$ to each value of $t$. The Chain Rule is a formula for the rate of change of $f[\mathbf{x}(t)]$ with respect to $t$. For instance $f(\mathbf{x})$ might be the temperature at position $\mathbf{x}$. Then the Chain Rule tells how fast the temperature is changing as the particle moves along the curve $\mathbf{x}(t)$.
Chain Rule Let $z = f(x, y)$, where $x = x(t)$ and $y = y(t)$. Then
$$\frac{dz}{dt} = \frac{\partial z}{\partial x}\frac{dx}{dt} + \frac{\partial z}{\partial y}\frac{dy}{dt},$$
where $\partial z/\partial x$ and $\partial z/\partial y$ are evaluated at $(x(t), y(t))$. In briefer notation,
$$\dot z = f_x \dot x + f_y \dot y.$$
In terms of vectors,
$$z = f[\mathbf{x}(t)], \qquad \dot z = f_x[\mathbf{x}(t)]\,\dot x(t) + f_y[\mathbf{x}(t)]\,\dot y(t).$$
Similar rules hold for functions of more than two variables. For instance, if $w = f(x, y, z)$, where $x = x(t)$, $y = y(t)$, $z = z(t)$, then
$$\frac{dw}{dt} = \frac{\partial w}{\partial x}\frac{dx}{dt} + \frac{\partial w}{\partial y}\frac{dy}{dt} + \frac{\partial w}{\partial z}\frac{dz}{dt}.$$
EXAMPLE 2.1
Let $w = f(x, y, z) = xy^2z^3$, where $x = t\cos t$, $y = e^t$, and $z = \ln(t^2 + 2)$. Compute $dw/dt$ at $t = 0$.
Solution: There is a direct but tedious way to do the problem. Write
$$w = (t\cos t)\,e^{2t}\,[\ln(t^2 + 2)]^3.$$
Differentiate, then set $t = 0$. That's quite a job! Use of the Chain Rule is much simpler:
$$\dot w = \frac{\partial w}{\partial x}\dot x + \frac{\partial w}{\partial y}\dot y + \frac{\partial w}{\partial z}\dot z,$$
where the partial derivatives are evaluated at
$$\mathbf{x}_0 = (x(0), y(0), z(0)) = (0, 1, \ln 2).$$
Since $w = xy^2z^3$,
$$\frac{\partial w}{\partial x}\Big|_{\mathbf{x}_0} = y^2z^3\Big|_{\mathbf{x}_0} = (\ln 2)^3, \qquad \frac{\partial w}{\partial y}\Big|_{\mathbf{x}_0} = 2xyz^3\Big|_{\mathbf{x}_0} = 0, \qquad \frac{\partial w}{\partial z}\Big|_{\mathbf{x}_0} = 3xy^2z^2\Big|_{\mathbf{x}_0} = 0,$$
hence
$$\dot w(0) = (\ln 2)^3\,\dot x(0) + 0 + 0.$$
But
$$\dot x(0) = (\cos t - t\sin t)\Big|_0 = 1,$$
therefore $\dot w(0) = (\ln 2)^3$.
Answer: $(\ln 2)^3$.
Another Version of the Chain Rule
Suppose $z = f(x, y)$, where this time $x$ and $y$ are functions of two variables, $x = x(s, t)$ and $y = y(s, t)$. Then indirectly, $z$ is a function of the variables $s$ and $t$.
There is a chain rule for computing dz/ds and dz/dt: Chain Rule If z = f ( x , y) is a function of two variables x and y, where x = x(s, t ) and y = y(s, t )} then dz dz dx ^ dz dy ds dx ds dy ds dz dz dx dz dy dt ~ dx dt + dy dt dz dz where and are evaluated at (x (s, t), y (s, t ) ). dx dy This Chain Rule is a consequence of the previous one. For instance, to compute dz/ds, hold t fixed, making x(s, t) and y(s, t) effectively functions of the one variable s. Then apply the previous Chain Rule. EXAMPLE 2.2 Let w = x2y, where x = s2 + t2and y = cos st. Compute dw ds 290 8. SEVERAL VARIABLE DIFFERENTIAL CALCULUS Solution: dw dw dx dw dy ds dx ds dy ds (2xy)2s + x2( t sin st) = 2 (s2 + t2) (cos st)2s + (s2 + t2)2( t sin st). Answer: (s2 + 2)[4s cos st t(s2 + t2) sin sQ* The next example is important in physical applications. EXAMPLE 2.3 If w = / ( x, y) } where x = r cos 0 and y = r sin 0, show that (>\ +()' = + 1 . / \ dy } \ dr J r2\d6 ) Solution: Use the Chain Rule to compute dw/dr and dw/dO: dw dw dx dw dy dw dw . A = ---------1---------= cos 6 H------ sin 6; dr dx dr dy dr dx dy dw dw dx dw dy dw . . , dw ---------- b ----- - = i r sin 6) -----r cos 6 d0 dx dd dy dd dx dy ( dw . dw \ ------ sin 6 + cos 6 ) . dx dy ) From these formulas Add: ( dw\ 2 ( dw\ dw dw . ( dw\ . ( ) = ( J cos26 + 2 -------sin 0 cos 6 + ( ) sm20, \dr / \ d x ) dx dy \ dy / 1 / dw\ 2 / dw\ 2 dw dw . / dw\ 2 - I ) = [ ) sin20 2 -------sm 0 cos 0 + ( ) cos2 0. r2 \^0 / \ dx / dx dy \ dy / 2. Chai n Rul e 291 EXERCISES Find dz/dt by the Chain Rule: 1. 2= exy; a; = St + 1, y = t2 2. z = #/?/; = + 1, y = ^1 3. z = x2cos y x; x = t2, y = 1/t 4. z = a?/y; * = cos t, y = 1+ 2. Find dw/eft by the Chain Rule: 5. w = xyz; x = t2, y z tA 6. w = e* cos(i/ + 2); x = 1/ 2, ?/ = t2, z = t 7. w e~xy2sin 0; x = t, y = 2t, z = At 8. w = (e~x sec ^)/?/2; x = t2, y = I t, z = tz. Find dz/ds and dz/d2by the Chain Rule: 9. z = z3/V; x = s2 t, y = 2st 10. z = (x + ?/2)4; x = se*, y = 11. 
z = \ / i + z2 + y*\ x st2>y 1 + si 13. The radius r and height Aof a conical tank#increase at rates r = 0.3 in./hr and h = 0.5 in./hr. Find the rate of increase V of the volume when r = 6 ft and h = 30 ft. 14. Prove that -( dX J g h(x) F(t) dt F[h(x)lh' (x) - F\ j , ( xW( x). q{x) 15. Given F (x, y), show that ~ F(u + v, u v) + F(u + v, u v) = 2Fx(u + v. u v). ou ov A function w = f i x, y, z) is homogeneous of degree n if f(tx, ty, tz) = tnf ( x, y, z) for all t > 0. The condition of homogeneity can be written vectorially: f(tx) = tnf (x). Show that the function is homogeneous: What degree? 16. x2+ yz 17. x y + 2z 18. x3 + ys + z3 - 3xyz 19. x2e~ylz 20. a . 1 x4+ y* + z4 a: + y * 22. Suppose/and <7are homogeneous of degree m and n respectively. Show that fg is homogeneous of degree mn. 23. Let f(x, y, z) be homogeneous of degree n. Show that f x is homogeneous of degree n 1. (Exception: n = 0 and / constant.) 24. Let f(x, y, z) be homogeneous of degree n. Prove Eulers Relation: xfx + yfv + zfz = nf. [Hint: Differentiate f(tx, ty}tz) = tnf(x, y, z) with respect to t, using the Chain Rule; then set t 1.] 292 8. SEVERAL VARIABLE DIFFERENTIAL CALCULUS 25. Verify Eulers Relation for the functions in Exs. 18 and 19. 26*. (Converse of Eulers Relation) Let /( x ) be differentiable for x ^ 0, and suppose x grad / = nf. Prove / is homogeneous of degree n. [Hint: Show that d[t~nf (tx)~\/dt = 0.] 3. TANGENT PLANE The graph of 2 = f i x, y) is a surface. Given a point x0 = (x0, y0, z0) on this surface, we are going to describe the plane tangent to the surface at x0. Let 2 = f (x, y) represent a surface and let x0 = (x0} y0) z0) be a fixed point on it. Consider all curves lying on the surface and passing through x0. Their velocity vectors fill out a plane through the origin. The parallel plane through x0 is called the tangent plane at x0. Let us see why this is so. Any curve on the surface (Fig. 
3.1) is given by
$$\mathbf{x}(t) = (x(t), y(t), z(t)) = (x(t), y(t), f[x(t), y(t)]).$$
Its velocity vector is found by the Chain Rule:
$$\mathbf{v}(t) = \dot{\mathbf{x}}(t) = (\dot x(t), \dot y(t), f_x \dot x + f_y \dot y).$$
Let us assume time is measured so that $\mathbf{x}(0) = \mathbf{x}_0$, that is,
$$x(0) = x_0, \qquad y(0) = y_0, \qquad z(0) = z_0.$$
Then
$$\mathbf{v}(0) = (\dot x(0), \dot y(0), f_x(x_0, y_0)\dot x(0) + f_y(x_0, y_0)\dot y(0)) = \dot x(0)\,(1, 0, f_x(x_0, y_0)) + \dot y(0)\,(0, 1, f_y(x_0, y_0)).$$
The vectors
$$\mathbf{w}_1 = (1, 0, f_x(x_0, y_0)) \quad\text{and}\quad \mathbf{w}_2 = (0, 1, f_y(x_0, y_0))$$
depend only on the function $f$ and the point $\mathbf{x}_0$, not on the particular curve (Fig. 3.2).
Suppose a curve lies on the surface $z = f(x, y)$ and passes through $\mathbf{x}_0 = (x_0, y_0, z_0)$. Then its velocity vector has the form
$$\mathbf{v} = a\mathbf{w}_1 + b\mathbf{w}_2,$$
where $\mathbf{w}_1 = (1, 0, f_x(x_0, y_0))$, $\mathbf{w}_2 = (0, 1, f_y(x_0, y_0))$, and $a$ and $b$ are constants, $a = \dot x(0)$ and $b = \dot y(0)$.
Given any pair of numbers $a$ and $b$, there are curves $\mathbf{x}(t)$ which lie on the surface, pass through $\mathbf{x}_0$ when $t = 0$, and have $\dot x(0) = a$ and $\dot y(0) = b$. One such is
$$\mathbf{x}(t) = (x_0 + at,\ y_0 + bt,\ f(x_0 + at, y_0 + bt)).$$
Its velocity vector is $a\mathbf{w}_1 + b\mathbf{w}_2$. Therefore, all vectors $a\mathbf{w}_1 + b\mathbf{w}_2$ actually occur as velocity vectors. It follows that the velocity vectors fill out a plane through the origin. The parallel plane through $\mathbf{x}_0$ is the tangent plane to the surface at $\mathbf{x}_0$. It consists of all points
$$\mathbf{x} = \mathbf{x}_0 + a\mathbf{w}_1 + b\mathbf{w}_2, \qquad a \text{ and } b \text{ arbitrary}.$$
EXAMPLE 3.1
Find the tangent plane to $z = x^2 + y$ at $(1, 1, 2)$.
Solution:
$$\mathbf{w}_1 = \left(1, 0, \frac{\partial z}{\partial x}(1, 1)\right) = (1, 0, 2), \qquad \mathbf{w}_2 = \left(0, 1, \frac{\partial z}{\partial y}(1, 1)\right) = (0, 1, 1).$$
Hence the typical velocity vector is
$$\mathbf{v} = a\mathbf{w}_1 + b\mathbf{w}_2 = a(1, 0, 2) + b(0, 1, 1) = (a, b, 2a + b).$$
The tangent plane consists of all points
$$(1, 1, 2) + (a, b, 2a + b) = (a + 1, b + 1, 2a + b + 2),$$
where $a$ and $b$ are arbitrary.
Answer: All points $(a + 1, b + 1, 2a + b + 2)$.
Remark: The typical point on this tangent plane is
$$\mathbf{x} = (x, y, z) = (a + 1, b + 1, 2a + b + 2).$$
Thus $a = x - 1$, $b = y - 1$, and
$$z = 2a + b + 2 = 2(x - 1) + (y - 1) + 2 = 2x + y - 1.$$
Consequently an equation for the tangent plane is $z = 2x + y - 1$.
The Normal
The vector $\mathbf{n} = (-f_x, -f_y, 1)$ is perpendicular to both tangent vectors $\mathbf{w}_1$ and $\mathbf{w}_2$:
$$\mathbf{n}\cdot\mathbf{w}_1 = (-f_x, -f_y, 1)\cdot(1, 0, f_x) = -f_x + 0 + f_x = 0,$$
$$\mathbf{n}\cdot\mathbf{w}_2 = (-f_x, -f_y, 1)\cdot(0, 1, f_y) = 0 - f_y + f_y = 0.$$
Consequently $\mathbf{n}$ is perpendicular to each vector $a\mathbf{w}_1 + b\mathbf{w}_2$:
$$\mathbf{n}\cdot(a\mathbf{w}_1 + b\mathbf{w}_2) = a(\mathbf{n}\cdot\mathbf{w}_1) + b(\mathbf{n}\cdot\mathbf{w}_2) = 0.$$
In other words, $\mathbf{n}$ is perpendicular to the tangent plane at $\mathbf{x}_0$, that is, $\mathbf{n}$ is a normal, so we may write a normal form of the plane.
The tangent plane to the surface $z = f(x, y)$ at the point $\mathbf{x}_0 = (x_0, y_0, z_0)$ is given by
$$-f_x\cdot(x - x_0) - f_y\cdot(y - y_0) + (z - z_0) = 0,$$
where $f_x$ and $f_y$ are evaluated at $(x_0, y_0)$. The vector form of this equation is
$$(\mathbf{x} - \mathbf{x}_0)\cdot\mathbf{n} = 0, \qquad \mathbf{n} = (-f_x, -f_y, 1).$$
EXAMPLE 3.2
Find an equation for the tangent plane to $z = x^2 + y$ at $(1, 1, 2)$.
Solution: Write $f(x, y) = x^2 + y$. Then
$$f_x(x, y) = 2x, \qquad f_y(x, y) = 1.$$
At $(1, 1, 2)$, $f_x = 2$, $f_y = 1$. Hence the equation is
$$-2(x - 1) - (y - 1) + (z - 2) = 0.$$
Answer: $2x + y - z = 1$.
The vector $(-f_x, -f_y, 1)$ is perpendicular to the tangent plane and has a positive $z$-component (it is the upward normal rather than the downward normal). We introduce the unit vector in the same direction.
The unit normal at $\mathbf{x}_0$ to the surface $z = f(x, y)$ is the vector
$$\mathbf{N} = \frac{1}{\sqrt{f_x^2 + f_y^2 + 1}}\,(-f_x, -f_y, 1),$$
where $f_x$ and $f_y$ are evaluated at $(x_0, y_0)$.
EXAMPLE 3.3
Find the unit normal at $(1, 1, 2)$ to the surface $z = x^2 + y$.
Solution: We have $(-f_x, -f_y, 1) = (-2, -1, 1)$, hence
$$\mathbf{N} = \frac{1}{\sqrt{2^2 + 1^2 + 1^2}}\,(-2, -1, 1).$$
Answer: $\mathbf{N} = \frac{1}{\sqrt{6}}(-2, -1, 1)$.
EXAMPLE 3.4
How far is $\mathbf{0}$ from the tangent plane to $z = x^2 + y$ at $(1, 1, 2)$?
Solution: The distance is the length of the projection of $\mathbf{x}_0 = (1, 1, 2)$ on $\mathbf{N}$. See Fig. 3.3.
Fig. 3.3
$$\text{Distance} = |\mathbf{x}_0\cdot\mathbf{N}| = \left|(1, 1, 2)\cdot\frac{1}{\sqrt{6}}(-2, -1, 1)\right| = \frac{|-2 - 1 + 2|}{\sqrt{6}} = \frac{1}{\sqrt{6}}.$$
Answer: $\tfrac{1}{6}\sqrt{6}$.
EXERCISES
Give the equation $z = ax + by + c$ of the tangent plane to the surface at the indicated point:
1.
z = x2 y2; x = 0, y = 0 2. z = x2y2; x = 1, y = 1 3. z = x2+ Ay2) x = 2, y = 1 4. z x2ey; x = 1, y = 2 5. z = x2y + ys; x = 1, y = 2 6. z = x cos y + y cos z; x = 0, y = 0 . Find the unit normal to the surface at the indicated point: 7. z x + x2yz + 2/; x = 0, y = 0 8. 2 = z3+ 2/3; * = 1, 2/ = - 1 9. z = x2+ xi/ + 2/2; x = 1, 2/ = 2 10. 2= v 1 - a:2 s/2; x = i , y =$.
11. Show that the tangent plane to the hyperbolic paraboloid $z = x^2 - y^2$ at $(0, 0, 0)$
intersects the surface in a pair of straight lines.
12. (cont.) Show that the conclusion is valid for the tangent plane at any point of the
surface.
13*. (cont.) Show that the property of the tangent planes in Ex. 12 is also valid for the
hyperboloid of one sheet $z^2 = x^2 + y^2 - 1$.
4. GRADIENT
The gradient field of a function $f$ on a region is the assignment of a certain
vector, called grad $f$, to each point of that region.
Suppose $z = f(x, y)$ is defined on a region $D$ of the $x, y$-plane. The gradient
of $f$ is the vector
$$\operatorname{grad} f = (f_x, f_y).$$
Likewise, if $w = f(x, y, z)$ is defined on a region $D$ of space, the gradient
of $f$ is the vector
$$\operatorname{grad} f = (f_x, f_y, f_z).$$
The gradient field of a function $f$ is the assignment of the vector grad $f$
to each point of the region $D$.
For example, if
$$f(x, y) = x^2 + y, \qquad \operatorname{grad} f = (2x, 1);$$
if
$$f(x, y, z) = |\mathbf{x}|^2 = x^2 + y^2 + z^2, \qquad \operatorname{grad} f = (2x, 2y, 2z) = 2\mathbf{x}.$$
In this section, we discuss several uses of the gradient. The first of these
concerns notation. Certain formulas are simplified if expressed in terms of
gradients. An example is the Chain Rule, which asserts
$$\dot z = f_x \dot x + f_y \dot y$$
if $z = f(x, y)$ and $x = x(t)$, $y = y(t)$. But $\operatorname{grad} f = (f_x, f_y)$ and $\dot{\mathbf{x}} = (\dot x, \dot y)$.
Therefore, in vector notation
$$\dot z = (\operatorname{grad} f)\cdot\dot{\mathbf{x}}.$$
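The vector form of the Chain Rule can be checked numerically. The sketch below is only an illustration, not part of the text's development: the sample function $f(x, y) = x^2 + y$ is the one used above, while the circular path $\mathbf{x}(t) = (\cos t, \sin t)$, the step size, and all function names are assumptions chosen for the example.

```python
import math

def f(x, y):
    """Sample function f(x, y) = x^2 + y (so grad f = (2x, 1))."""
    return x * x + y

def grad(func, x, y, h=1e-6):
    """Central-difference approximation to grad func = (f_x, f_y)."""
    fx = (func(x + h, y) - func(x - h, y)) / (2 * h)
    fy = (func(x, y + h) - func(x, y - h)) / (2 * h)
    return fx, fy

def path(t):
    """Assumed path x(t) = (cos t, sin t)."""
    return math.cos(t), math.sin(t)

def velocity(t):
    """Velocity of the assumed path: (-sin t, cos t)."""
    return -math.sin(t), math.cos(t)

def z_dot_chain(t):
    """Right-hand side of the Chain Rule: (grad f) . x_dot."""
    x, y = path(t)
    fx, fy = grad(f, x, y)
    vx, vy = velocity(t)
    return fx * vx + fy * vy

def z_dot_direct(t, h=1e-6):
    """Differentiate z(t) = f(x(t), y(t)) directly by central differences."""
    return (f(*path(t + h)) - f(*path(t - h))) / (2 * h)

t0 = 0.7
print(z_dot_chain(t0), z_dot_direct(t0))  # the two values should nearly agree
```

Both numbers approximate $\dot z = \cos t - \sin 2t$ at $t = 0.7$, which is what differentiating $z = \cos^2 t + \sin t$ by hand gives.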
Level Curves
Imagine a surface $z = f(x, y)$ above a portion of the $x, y$-plane. In the
plane we can draw a contour map of the surface by indicating curves of constant
altitude. These are called contour lines or level curves. Figure 4.1
shows a contour map with two hills and a pass between them. Level curves
are obtained by slicing the surface $z = f(x, y)$ with planes $z = c$ for various
constants $c$. Each plane $z = c$ intersects the surface in a plane curve (Fig.
4.2). The projection of this curve onto the $x, y$-plane is the level curve at
level $c$. It is the graph of $f(x, y) = c$. Where level curves are close together
the surface is steep; where they are far apart it is relatively flat.
An important relation exists between the gradient field and the level
curves of a function.
The gradient field of $f$ is orthogonal (perpendicular) to the level curves of $f$.
For suppose a particle moves along the level curve $f(x, y) = c$. Let its position
at time $t$ be $\mathbf{x}(t) = (x(t), y(t))$. Then
$$f[x(t), y(t)] = c.$$
Therefore the time derivative is zero:
$$f_x \dot x + f_y \dot y = 0, \qquad (\operatorname{grad} f)\cdot\dot{\mathbf{x}} = 0.$$
Hence grad $f$ is orthogonal to the tangent to the level curve.
Suppose we want an explicit equation for the tangent to a level curve
$f(x, y) = c$ at a point $\mathbf{x}_0 = (x_0, y_0)$. We know a point, $\mathbf{x}_0$, on the tangent and
we know a normal, $\operatorname{grad} f(\mathbf{x}_0)$, so the answer is
$$(\mathbf{x} - \mathbf{x}_0)\cdot\operatorname{grad} f(\mathbf{x}_0) = 0.$$
EXAMPLE 4.1
Find the tangent to the cubic $x^2 = y^3$ at $(-1, 1)$.
Solution: Set $f(x, y) = x^2 - y^3$. Then the cubic is the level curve $f = 0$.
We have
$$\operatorname{grad} f(\mathbf{x}_0) = (2x, -3y^2)\Big|_{(-1, 1)} = (-2, -3).$$
Hence the tangent is
$$[(x, y) - (-1, 1)]\cdot(-2, -3) = 0,$$
$$-2(x + 1) - 3(y - 1) = 0.$$
Answer: $2x + 3y = 1$.
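The computation in this example can be reproduced in a few lines. The following is a minimal sketch with exact integer arithmetic; the helper names and the parametrization $x = t^3$, $y = t^2$ of the cubic are assumptions introduced for the illustration, not notation from the text.

```python
def grad_f(x, y):
    """Gradient of f(x, y) = x^2 - y^3, whose zero level curve is x^2 = y^3."""
    return (2 * x, -3 * y * y)

def tangent_line(x0, y0):
    """Return (A, B, C) with A*x + B*y = C, i.e. (x - x0) . grad f(x0) = 0."""
    a, b = grad_f(x0, y0)
    return a, b, a * x0 + b * y0

A, B, C = tangent_line(-1, 1)
print(A, B, C)  # coefficients of the tangent line A*x + B*y = C

# Assumed parametrization of the cubic: x = t^3, y = t^2, so (-1, 1) is t = -1.
t = -1
velocity = (3 * t * t, 2 * t)            # derivative of (t^3, t^2)
dot = A * velocity[0] + B * velocity[1]  # grad f . velocity
print(dot)  # vanishes when grad f is orthogonal to the curve
```

The coefficients come out as $-2x - 3y = -1$, which is the answer $2x + 3y = 1$ after a change of sign, and the dot product with the velocity of the parametrized curve is zero, as the orthogonality statement predicts.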
Level Surfaces
We cannot graph a function of three variables
$$w = f(x, y, z)$$
(the graph would be four-dimensional). We can, however, learn a good deal
about the function by plotting in three-space the level surfaces
$$f(x, y, z) = \text{constant}.$$
For example, the level surfaces of
$$f(x, y, z) = x^2 + y^2 + z^2$$
are the spheres
$$x^2 + y^2 + z^2 = c^2$$
centered at the origin.
The gradient field of $f$ is orthogonal to the level surfaces of $f$.
The proof of this statement is practically identical to the proof of the analogous
statement in two variables.
Just as for curves, the gradient provides a convenient tool for finding
explicitly tangents to level surfaces. Suppose we want the tangent plane at a
point $\mathbf{x}_0$ of the level surface $f(\mathbf{x}) = c$. The answer is
$$(\mathbf{x} - \mathbf{x}_0)\cdot\operatorname{grad} f(\mathbf{x}_0) = 0.$$
Note that this method fails at points where $\operatorname{grad} f = \mathbf{0}$. At such points there
usually is not a clearly defined tangent plane anyhow.
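EXAMPLE 4.2
Let $A$ be a non-zero $3 \times 3$ symmetric matrix and let $\mathbf{x}_0$ be a point
on the quadric surface
$$\mathbf{x}A\mathbf{x}' = 1.$$
Find the tangent plane to the surface at $\mathbf{x}_0$.
Solution: Let $f(\mathbf{x}) = \mathbf{x}A\mathbf{x}'$. Then the quadric is the level surface
$f = 1$, and since $\mathbf{x}_0$ is on the surface, $\mathbf{x}_0A\mathbf{x}_0' = 1$.
Our problem is to compute grad $f$. To be definite, set
$$A = \begin{bmatrix} a & b & d \\ b & c & e \\ d & e & f \end{bmatrix}.$$
The terms involving $x$ in $\mathbf{x}A\mathbf{x}'$ are $ax^2 + 2bxy + 2dzx$, so $\partial f/\partial x = 2(ax + by + dz)$.
Similarly $\partial f/\partial y = 2(bx + cy + ez)$ and $\partial f/\partial z = 2(dx + ey + fz)$.
It follows that
$$\operatorname{grad} f = 2\mathbf{x}A.$$
Therefore, the tangent plane at $\mathbf{x}_0$ is given by
$$(\mathbf{x} - \mathbf{x}_0)\cdot(2\mathbf{x}_0A) = 0.$$
(Note that $\mathbf{x}_0A \ne \mathbf{0}$ since $\mathbf{x}_0A\mathbf{x}_0' = 1$.) By the symmetry of $A$, the equation
can be written
$$(\mathbf{x} - \mathbf{x}_0)(\mathbf{x}_0A)' = 0, \qquad (\mathbf{x} - \mathbf{x}_0)A\mathbf{x}_0' = 0, \qquad \mathbf{x}A\mathbf{x}_0' = \mathbf{x}_0A\mathbf{x}_0' = 1.$$
Answer: $\mathbf{x}A\mathbf{x}_0' = 1$.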
Remark: This answer is particularly simple. You just replace one of the
$\mathbf{x}$'s in $\mathbf{x}A\mathbf{x}' = 1$ by $\mathbf{x}_0$. For instance, the tangent plane to the hyperboloid
$2x^2 + 3y^2 - 4z^2 = 1$ at a point $(x_0, y_0, z_0)$ on the surface is
$2xx_0 + 3yy_0 - 4zz_0 = 1$, by inspection.
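As a quick check of the rule in the Remark, here is a small sketch for the hyperboloid $2x^2 + 3y^2 - 4z^2 = 1$. The point $(1, 1, 1)$, the test points, and the helper names are assumptions picked for the illustration; the arithmetic is exact because only integers are used.

```python
# Symmetric matrix of the quadric 2x^2 + 3y^2 - 4z^2 = 1.
A = [[2, 0, 0],
     [0, 3, 0],
     [0, 0, -4]]

def row_times_matrix(x, M):
    """Row vector x times the 3 x 3 matrix M."""
    return [sum(x[i] * M[i][j] for i in range(3)) for j in range(3)]

def dot(u, v):
    """Ordinary dot product."""
    return sum(a * b for a, b in zip(u, v))

def quadric(x):
    """f(x) = x A x'."""
    return dot(row_times_matrix(x, A), x)

x0 = [1, 1, 1]                     # assumed point; 2 + 3 - 4 = 1, so it is on the surface
normal = row_times_matrix(x0, A)   # x0 A, which is half of grad f(x0)

def on_tangent_plane(x):
    """True when x satisfies x A x0' = 1."""
    return dot(x, normal) == 1

print(quadric(x0), normal)
print(on_tangent_plane([0, 3, 2]), on_tangent_plane([0, 0, 0]))
```

Here `normal` is $(2, 3, -4)$, so the tangent plane at $(1, 1, 1)$ is $2x + 3y - 4z = 1$, exactly what replacing one factor of each square by the coordinates of $\mathbf{x}_0$ gives.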
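EXERCISES
Plot the level curves and the gradient field in the region $|x| < 3$, $|y| < 3$:
1. $z = x - 2y$
2. $z = x^2 - y^2$
3. $z = x^2 + y$
4. $z = x^2 + 4y^2$.
Describe the level surfaces:
5. $w = x - y - z$
6. $w = x^2 + 4y^2 + 9z^2$
7. $w = xyz$
8. $w = x^2 + y^2 - z^2$.
9. For each function in Exs. 5-8, find the gradient field.
10. Suppose $z = f(r, \theta)$ is given in terms of polar coordinates. Show that
$$\operatorname{grad} z = f_r\,\mathbf{u} + \frac{1}{r} f_\theta\,\mathbf{w},$$
where $\mathbf{u} = (\cos\theta, \sin\theta)$ and $\mathbf{w} = (-\sin\theta, \cos\theta)$.
11. Find $\operatorname{grad}(r^{-2}\cos 2\theta)$.
Find the tangent line to
12. $x^2 - 3xy + y^2 = -1$ at $(1, 2)$
13. $x + x^3y^4 - y = 0$ at $(0, 0)$
14. $y + \sin xy = 1$ at $(0, 1)$.
Find the tangent plane to
15. $z = x^2 - y^2$ at $(1, 0, 1)$
16. $xyz = 1$ at $(1, 1, 1)$.
17. Let $\mathbf{a}$ be a constant vector. Find the gradient of $f(\mathbf{x}) = \mathbf{a}\cdot\mathbf{x}$.
18*. Let $\mathbf{x}_0$ be a point of the quadric $\mathbf{x}A\mathbf{x}' + 2\mathbf{a}\mathbf{x}' = 1$, where $A$ is a $3 \times 3$ symmetric
matrix and $\mathbf{a}$ a vector. Assuming $\mathbf{x}_0A + \mathbf{a} \ne \mathbf{0}$, prove that the tangent plane at
$\mathbf{x}_0$ is
$$\mathbf{x}A\mathbf{x}_0' + \mathbf{a}(\mathbf{x}' + \mathbf{x}_0') = 1.$$
Apply this to $z = xy$.
Let $\mathbf{u} = (u, v, w)$ be a vector field (p. 306) on an open domain of $\mathbf{R}^3$ with differentiable
coordinate functions. Define the divergence and curl of $\mathbf{u}$ by
$$\operatorname{div}\mathbf{u} = u_x + v_y + w_z, \qquad \operatorname{curl}\mathbf{u} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \partial/\partial x & \partial/\partial y & \partial/\partial z \\ u & v & w \end{vmatrix}.$$
Prove:
19. $\operatorname{div}(f\mathbf{u}) = (\operatorname{grad} f)\cdot\mathbf{u} + f(\operatorname{div}\mathbf{u})$
20. $\operatorname{div}(f(\rho)\mathbf{x}) = [\rho^3 f(\rho)]'/\rho^2$, where $\rho^2 = x^2 + y^2 + z^2$
21. $\operatorname{curl}(f\mathbf{u}) = (\operatorname{grad} f)\times\mathbf{u} + f(\operatorname{curl}\mathbf{u})$
22. $\operatorname{curl}(\operatorname{grad} f) = \mathbf{0}$
23. $\operatorname{curl}[f(\rho)\mathbf{x}] = \mathbf{0}$, where $\rho^2 = x^2 + y^2 + z^2$
24. $\operatorname{div}(\operatorname{curl}\mathbf{u}) = 0$
25. $\operatorname{curl}(\mathbf{a}\times\mathbf{x}) = 2\mathbf{a}$
26*. $\operatorname{div}[\operatorname{grad} f(\rho)] = [\rho f(\rho)]''/\rho$, where $\rho^2 = x^2 + y^2 + z^2$.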
5. DIRECTIONAL DERIVATIVE
Given a function $w = f(x, y, z)$ and a point $\mathbf{x}$ in space, we may ask how
fast the function is changing at $\mathbf{x}$ in various directions. (A direction is indicated
by a unit vector $\mathbf{u}$.)
The directional derivative of $f(x, y, z)$ at a point $\mathbf{x}$ in the direction $\mathbf{u}$ is
$$D_{\mathbf{u}} f(\mathbf{x}) = \frac{d}{dt} f(\mathbf{x} + t\mathbf{u})\Big|_{t=0}.$$
Think of a directional derivative this way. Imagine a particle moving
along a straight line with constant velocity $\mathbf{u}$, passing through the point $\mathbf{x}$
when $t = 0$. See Fig. 5.1. To each point $\mathbf{x} + t\mathbf{u}$ of its path is assigned the
number
$$w(t) = f(\mathbf{x} + t\mathbf{u}).$$
Fig. 5.1
Then
$$D_{\mathbf{u}} f(\mathbf{x}) = w'(0),$$
the rate of change of $w(t)$ as the particle moves through the point $\mathbf{x}$. For
example, suppose $f(x, y, z)$ is the steady temperature at each point $(x, y, z)$
of a fluid. Suppose a particle moves with unit speed through a point $\mathbf{x}$ in the
direction $\mathbf{u}$. Then $D_{\mathbf{u}} f(\mathbf{x})$ measures the time rate of change of the particle's
temperature.
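There is a handy formula for directional derivatives. Let
$$\mathbf{p}(t) = \mathbf{x} + t\mathbf{u}.$$
Then $w(t) = f[\mathbf{p}(t)]$, a composite function. By the Chain Rule,
$$D_{\mathbf{u}} f(\mathbf{x}) = w'(0) = \frac{d}{dt} f[\mathbf{p}(t)]\,\Big|_{t=0} = (\operatorname{grad} f)\cdot\dot{\mathbf{p}}(0) = (\operatorname{grad} f)\cdot\mathbf{u}.$$
Since $\mathbf{u}$ is a unit vector, $(\operatorname{grad} f)\cdot\mathbf{u}$ is simply the projection of $\operatorname{grad} f$ on $\mathbf{u}$.
The derivative of $f$ in the direction $\mathbf{u}$ is the projection of $\operatorname{grad} f$ on $\mathbf{u}$:
$$D_{\mathbf{u}} f(\mathbf{x}) = (\operatorname{grad} f)\cdot\mathbf{u}.$$
In particular, if $\mathbf{u} = \mathbf{i} = (1, 0, 0)$, then
$$D_{\mathbf{i}} f(\mathbf{x}) = (\operatorname{grad} f)\cdot(1, 0, 0) = \left(\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z}\right)\cdot(1, 0, 0) = \frac{\partial f}{\partial x}.$$
A similar situation holds for $\mathbf{j} = (0, 1, 0)$ and $\mathbf{k} = (0, 0, 1)$.
The directional derivatives of $f(x, y, z)$ in the directions $\mathbf{i}$, $\mathbf{j}$, and $\mathbf{k}$ are the
partial derivatives:
$$D_{\mathbf{i}} f = \frac{\partial f}{\partial x}, \qquad D_{\mathbf{j}} f = \frac{\partial f}{\partial y}, \qquad D_{\mathbf{k}} f = \frac{\partial f}{\partial z}.$$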
EXAMPLE 5.1
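Compute the directional derivatives of $f(x, y, z) = xy^2z^3$ at
$(3, 2, 1)$, in the direction of the vectors
(a) $(-2, -1, 0)$, (b) $(5, 4, 1)$.
Solution:
$$D_{\mathbf{u}} f(\mathbf{x}) = (\operatorname{grad} f)\cdot\mathbf{u},$$
where $\mathbf{u}$ is a unit vector in the desired direction, and $\operatorname{grad} f$ is evaluated at
$(3, 2, 1)$. Since
$$\frac{\partial f}{\partial x} = y^2z^3, \qquad \frac{\partial f}{\partial y} = 2xyz^3, \qquad \text{and} \qquad \frac{\partial f}{\partial z} = 3xy^2z^2,$$
$$\operatorname{grad} f\,\Big|_{(3,2,1)} = (4, 12, 36).$$
Thus
$$D_{\mathbf{u}} f(\mathbf{x}) = (4, 12, 36)\cdot\mathbf{u}.$$
(a) $\mathbf{u} = \dfrac{1}{\sqrt{5}}(-2, -1, 0)$,
$$D_{\mathbf{u}} f(\mathbf{x}) = (4, 12, 36)\cdot\frac{1}{\sqrt{5}}(-2, -1, 0) = -\frac{20}{\sqrt{5}}.$$
(b) $\mathbf{u} = \dfrac{1}{\sqrt{42}}(5, 4, 1)$,
$$D_{\mathbf{u}} f(\mathbf{x}) = (4, 12, 36)\cdot\frac{1}{\sqrt{42}}(5, 4, 1) = \frac{104}{\sqrt{42}}.$$
Answer: (a) $-\dfrac{20}{\sqrt{5}}$; (b) $\dfrac{104}{\sqrt{42}}$.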
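The two values in Example 5.1 can be checked numerically. The following Python sketch evaluates $(\operatorname{grad} f)\cdot\mathbf{u}$ for $f = xy^2z^3$ and compares it with a central-difference estimate of $\frac{d}{dt} f(\mathbf{x} + t\mathbf{u})$ at $t = 0$. The helper names (`grad_f`, `directional_derivative`, `finite_difference`) and the step size are illustrative choices, not part of the text.

```python
import math

def f(x, y, z):
    return x * y**2 * z**3

def grad_f(x, y, z):
    # Partial derivatives of f = x y^2 z^3, computed by hand.
    return (y**2 * z**3, 2 * x * y * z**3, 3 * x * y**2 * z**2)

def unit(v):
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]

def directional_derivative(point, v):
    """(grad f) . u, where u is the unit vector in the direction of v."""
    return sum(g * c for g, c in zip(grad_f(*point), unit(v)))

def finite_difference(point, v, t=1e-6):
    """Central-difference estimate of d/dt f(x + t u) at t = 0."""
    u = unit(v)
    plus = [p + t * c for p, c in zip(point, u)]
    minus = [p - t * c for p, c in zip(point, u)]
    return (f(*plus) - f(*minus)) / (2 * t)

point = (3, 2, 1)
da = directional_derivative(point, (-2, -1, 0))  # part (a)
db = directional_derivative(point, (5, 4, 1))    # part (b)
```

Here `da` and `db` should agree with $-20/\sqrt{5}$ and $104/\sqrt{42}$ up to rounding, and `finite_difference` gives an independent check that does not use the gradient formula.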
Question: In what direction is a given function $f$ increasing fastest?
In other words, at a fixed point $\mathbf{x}$ in space, for which unit vector $\mathbf{u}$ is
$D_{\mathbf{u}} f(\mathbf{x})$
largest? Now
$$D_{\mathbf{u}} f(\mathbf{x}) = (\operatorname{grad} f)\cdot\mathbf{u} = |\operatorname{grad} f|\cos\theta,$$
where $\theta$ is the angle between $\operatorname{grad} f$ and $\mathbf{u}$. Therefore, the largest value of
$D_{\mathbf{u}} f(\mathbf{x})$ is $|\operatorname{grad} f|$, taken where $\cos\theta = 1$, that is, $\theta = 0$.
The direction of most rapid increase of $f(x, y, z)$ at a point $\mathbf{x}$ is the direction
of the gradient. The derivative in that direction is $|\operatorname{grad} f|$.
The direction of most rapid decrease is opposite to the direction of the
gradient.
EXAMPLE 5.2
Find the direction of most rapid increase of the function
$$f(x, y, z) = x^2 + yz$$
at $(1, 1, 1)$ and give the rate of increase in this direction.
Solution:
$$\operatorname{grad} f\,\Big|_{(1,1,1)} = (2x, z, y)\,\Big|_{(1,1,1)} = (2, 1, 1).$$
The most rapid increase is
$$D_{\mathbf{u}} f = |\operatorname{grad} f| = \sqrt{2^2 + 1^2 + 1^2} = \sqrt{6},$$
where $\mathbf{u}$ is the direction of $\operatorname{grad} f$:
$$\mathbf{u} = \frac{\operatorname{grad} f}{|\operatorname{grad} f|} = \frac{1}{\sqrt{6}}(2, 1, 1).$$
Answer: $D_{\mathbf{u}} f\,\Big|_{(1,1,1)} = \sqrt{6}$ for $\mathbf{u} = \dfrac{1}{\sqrt{6}}(2, 1, 1)$.
We conclude this section with two examples involving directions of most
rapid change of a function.
Consider water running down a hill from a spring at a point $P$. See Fig.
5.2. The water descends as quickly as possible. What is the path of the stream?
From physics, change in kinetic energy [$\tfrac{1}{2}m(\text{speed})^2$] equals change in
potential energy [height]. Hence the speed of a water particle depends only
on how far it has descended (its altitude). Since the speed at a given time
does not depend on direction, the particle chooses the direction of steepest
descent (most rapid change of altitude). Let the hill be represented by the
surface $z = f(x, y)$. Then water will flow in the direction of $-\operatorname{grad} f$, that
is, perpendicular to the level curves.
Next, consider a function $z = f(x, y)$ and all curves $\mathbf{x}(t) = (x(t), y(t))$ in
the $x, y$-plane which pass through a fixed point $\mathbf{x}_0$ with speed 1. Along which
of these is $f(x, y)$ increasing the fastest at $\mathbf{x}_0$?
To find the direction of most rapid increase, write
$$z(t) = f[\mathbf{x}(t)].$$
By the Chain Rule,
$$\dot{z} = (\operatorname{grad} f)\cdot\dot{\mathbf{x}} = (\operatorname{grad} f)\cdot\mathbf{v},$$
where $\mathbf{v}$ is the velocity vector. But $\mathbf{v}$ is a unit vector; hence $\dot{z}$ is the
directional derivative in the direction of $\mathbf{v}$ (tangential to the curve $\mathbf{x}(t)$). Thus
$\dot{z}$ is greatest for those curves whose tangents at $\mathbf{x}_0$ point in the direction of
$\operatorname{grad} f$; such curves are orthogonal to the level curves.
EXERCISES
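Find the directional derivative of $f(x, y, z)$ at $\mathbf{x}_0$ in the directions of $\mathbf{v}_1$, $\mathbf{v}_2$, and $\mathbf{v}_3$:
1. $f = x + y + z$; $\mathbf{x}_0 = \mathbf{0}$, $\mathbf{v}_1 = (1, 0, 0)$, $\mathbf{v}_2 = (0, 1, 0)$, $\mathbf{v}_3 = (0, 0, 1)$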
2. / = xy + yz + zx; Xo= (1,1,1), Vi= (1,1,1), v2= (1, 1,1),
v3= (1, 1, 1)
3 . f = x y z ; * - ( 1 , - 1 , 2 ) , vx = (1,1,0), v2= (1,0,1), v , = (0,1,1)
4. / = xhfz*; Xo= ( - 1 , - 1 , - 1 ) , * - ( 1 , 2 , 3 ) , v2= (1, 1, 0), v3= (3,2,1).
Find the largest directional derivative of $f(x, y, z)$ at $\mathbf{x}_0$:
5 . f = x 3+ y2+ z; x0 = 0 6. f = xyz; Xo= (1, 1, 1)
7. f = x * + y2+ z2; x0= (1,2,2) 8. / = z2- y2 + \ z \ x = ( - 1 , - 1 , 1 ) .
6. APPLICATIONS
A vector field is the assignment of a vector F (x) to each point x of a region
D in space. (The gradient of a function is one example.)
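Let $\mathbf{F}(\mathbf{x})$ be a vector field on $D$ and suppose $\mathbf{x}(t)$ is a path in $D$ from
$\mathbf{x}_0 = \mathbf{x}(t_0)$ to $\mathbf{x}_1 = \mathbf{x}(t_1)$. The line integral
$$\int_{\mathbf{x}_0}^{\mathbf{x}_1} \mathbf{F}\cdot d\mathbf{x}$$
is defined over this path and is computed by the ordinary integral
$$\int_{t_0}^{t_1} \mathbf{F}[\mathbf{x}(t)]\cdot\dot{\mathbf{x}}(t)\,dt.$$
Its value generally depends on the path $\mathbf{x}(t)$ connecting $\mathbf{x}_0$ and $\mathbf{x}_1$.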
EXAMPLE 6.1
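Let $\mathbf{F}(\mathbf{x})$ be the vector field $\mathbf{F}(\mathbf{x}) = (x, y, x + y + z)$. Let
$\mathbf{x}_0 = (0, 0, 0)$ and $\mathbf{x}_1 = (1, 1, 1)$. Compute the line integral
$$\int_{\mathbf{x}_0}^{\mathbf{x}_1} \mathbf{F}\cdot d\mathbf{x}$$
over the paths
(a) $\mathbf{x}(t) = (t, t, t)$, (b) $\mathbf{x}(t) = (t, t^2, t^3)$.
Solution: Notice that for both paths, $\mathbf{x}_0 = \mathbf{x}(0)$ and $\mathbf{x}_1 = \mathbf{x}(1)$.
(a) On this path $\mathbf{x} = (t, t, t)$ and $\dot{\mathbf{x}}(t) = (1, 1, 1)$.
$$\int_{\mathbf{x}_0}^{\mathbf{x}_1} \mathbf{F}\cdot d\mathbf{x} = \int_0^1 \mathbf{F}[\mathbf{x}(t)]\cdot\dot{\mathbf{x}}(t)\,dt = \int_0^1 (t, t, 3t)\cdot(1, 1, 1)\,dt = \int_0^1 5t\,dt = \frac{5}{2}.$$
(b) This time $\mathbf{x} = (t, t^2, t^3)$ and $\dot{\mathbf{x}}(t) = (1, 2t, 3t^2)$.
$$\int_{\mathbf{x}_0}^{\mathbf{x}_1} \mathbf{F}\cdot d\mathbf{x} = \int_0^1 (t, t^2, t + t^2 + t^3)\cdot(1, 2t, 3t^2)\,dt = \int_0^1 (t + 5t^3 + 3t^4 + 3t^5)\,dt = \frac{1}{2} + \frac{5}{4} + \frac{3}{5} + \frac{1}{2} = \frac{57}{20}.$$
Answer: (a) $\dfrac{5}{2}$; (b) $\dfrac{57}{20}$.
In an important special case, the line integral does not depend on the
path $\mathbf{x}(t)$, but only on the initial and terminal points $\mathbf{x}_0$ and $\mathbf{x}_1$:
If the vector field $\mathbf{F}(\mathbf{x})$ is the gradient of some function $f$,
$$\mathbf{F} = \operatorname{grad} f,$$
then
$$\int_{\mathbf{x}_0}^{\mathbf{x}_1} \mathbf{F}\cdot d\mathbf{x} = f(\mathbf{x}_1) - f(\mathbf{x}_0).$$
Thus the value of the line integral is independent of the path connecting
$\mathbf{x}_0$ and $\mathbf{x}_1$.
This assertion is easily verified by means of the Chain Rule,
$$\frac{d}{dt} f[\mathbf{x}(t)] = (\operatorname{grad} f)\cdot\dot{\mathbf{x}}.$$
If $\mathbf{F} = \operatorname{grad} f$, then
$$\int_{\mathbf{x}_0}^{\mathbf{x}_1} \mathbf{F}\cdot d\mathbf{x} = \int_{t_0}^{t_1} (\operatorname{grad} f)\cdot\dot{\mathbf{x}}\,dt = \int_{t_0}^{t_1} \frac{d}{dt} f[\mathbf{x}(t)]\,dt = f[\mathbf{x}(t_1)] - f[\mathbf{x}(t_0)] = f(\mathbf{x}_1) - f(\mathbf{x}_0).$$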
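The path dependence in Example 6.1, and the path independence for a gradient field, can both be seen numerically. The sketch below approximates $\int \mathbf{F}[\mathbf{x}(t)]\cdot\dot{\mathbf{x}}(t)\,dt$ over $0 \le t \le 1$ with the midpoint rule. The function `line_integral` and the comparison field $\operatorname{grad}(xyz) = (yz, xz, xy)$ are illustrative assumptions, not taken from the text.

```python
def line_integral(F, path, velocity, n=2000):
    """Midpoint-rule approximation of the integral of F[x(t)] . x'(t), 0 <= t <= 1."""
    total = 0.0
    for i in range(n):
        t = (i + 0.5) / n
        total += sum(a * b for a, b in zip(F(*path(t)), velocity(t))) / n
    return total

# The two paths of Example 6.1, both from (0, 0, 0) to (1, 1, 1).
straight_path = (lambda t: (t, t, t), lambda t: (1, 1, 1))
twisted_path = (lambda t: (t, t**2, t**3), lambda t: (1, 2 * t, 3 * t**2))

def F(x, y, z):
    # The field of Example 6.1 (not a gradient).
    return (x, y, x + y + z)

def G(x, y, z):
    # Assumed comparison field: the gradient of g = xyz.
    return (y * z, x * z, x * y)

straight = line_integral(F, *straight_path)    # close to 5/2
twisted = line_integral(F, *twisted_path)      # close to 57/20
g_straight = line_integral(G, *straight_path)  # close to g(1,1,1) - g(0,0,0) = 1
g_twisted = line_integral(G, *twisted_path)    # also close to 1
```

The two values for `F` differ, while the two values for the gradient field `G` agree with the difference of $xyz$ between the endpoints.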
Remark: Not every vector field is the gradient of some function. For
example, the field $\mathbf{F}(\mathbf{x}) = (x, y, x + y + z)$ is not a gradient;
$$\int_{\mathbf{x}_0}^{\mathbf{x}_1} \mathbf{F}\cdot d\mathbf{x}$$
is not independent of the path (see Example 6.1).
Application to Physics
If $\mathbf{F} = \operatorname{grad} f$ is a force, the net work done by this force in moving a particle
from $\mathbf{x}_0$ to $\mathbf{x}_1$ is $f(\mathbf{x}_1) - f(\mathbf{x}_0)$, independent of the path. The function $f$, which
is unique up to an additive constant, is called the potential of the force.
An important example is that of a central force subject to the inverse
square law, for instance, the electric force $\mathbf{E}$ on a unit charge at $\mathbf{x}$ due to a
unit charge of the same sign at the origin. The magnitude of the vector $\mathbf{E}$ is
inversely proportional to $|\mathbf{x}|^2$. Its direction is the same as that of $\mathbf{x}$. See Fig. 6.1.
The unit vector in the direction of $\mathbf{x}$ is
$$\frac{\mathbf{x}}{|\mathbf{x}|} = \frac{\mathbf{x}}{\rho}, \qquad \rho = |\mathbf{x}|.$$
Therefore, expressed in suitable units,
$$\mathbf{E} = \frac{1}{\rho^2}\,\frac{\mathbf{x}}{\rho} = \frac{\mathbf{x}}{\rho^3}.$$
The force field $\mathbf{E}$ is defined at all points of space except the origin. We
shall prove that $\mathbf{E}$ is the gradient of a function, in fact that
$$\mathbf{E} = \operatorname{grad} f, \qquad \text{where} \qquad f(x, y, z) = -\frac{1}{\rho} = -\frac{1}{\sqrt{x^2 + y^2 + z^2}}.$$
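Let us compute the gradient of $f = -1/\rho$:
$$\frac{\partial f}{\partial x} = \frac{x}{(x^2 + y^2 + z^2)^{3/2}} = \frac{x}{\rho^3},$$
and similarly
$$\frac{\partial f}{\partial y} = \frac{y}{\rho^3}, \qquad \frac{\partial f}{\partial z} = \frac{z}{\rho^3}.$$
Therefore
$$\operatorname{grad} f = \frac{\mathbf{x}}{\rho^3} = \mathbf{E}.$$
Since $\mathbf{E} = \operatorname{grad} f$, it follows from our discussion of line integrals that
$$\int_{\mathbf{x}_0}^{\mathbf{x}_1} \mathbf{E}\cdot d\mathbf{x} = f(\mathbf{x}_1) - f(\mathbf{x}_0) = \frac{1}{|\mathbf{x}_0|} - \frac{1}{|\mathbf{x}_1|}.$$
The right-hand side is the potential difference or voltage. It represents the
work done by the electric force when a unit charge moves from $\mathbf{x}_0$ to $\mathbf{x}_1$ along
any path.
If $\mathbf{x}_1$ is far out, then $1/|\mathbf{x}_1|$ is small, so
$$\int_{\mathbf{x}_0}^{\mathbf{x}_1} \mathbf{E}\cdot d\mathbf{x} \approx \frac{1}{|\mathbf{x}_0|}.$$
As $\mathbf{x}_1$ moves farther out, the approximation improves. In mathematical
shorthand,
$$\int_{\mathbf{x}_0}^{\mathbf{x}_1} \mathbf{E}\cdot d\mathbf{x} \longrightarrow \frac{1}{|\mathbf{x}_0|} \quad \text{as} \quad |\mathbf{x}_1| \longrightarrow \infty,$$
or
$$\int_{\mathbf{x}_0}^{\infty} \mathbf{E}\cdot d\mathbf{x} = \frac{1}{|\mathbf{x}_0|}.$$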
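As a rough numerical illustration of the potential difference formula, the following Python sketch integrates $\mathbf{E} = \mathbf{x}/\rho^3$ along two different paths from $\mathbf{x}_0 = (1, 0, 0)$ to $\mathbf{x}_1 = (3, 0, 4)$ and compares both results with $1/|\mathbf{x}_0| - 1/|\mathbf{x}_1| = 1 - 1/5$. The endpoints, the spiral path, and the function names are assumptions made for the example.

```python
import math

def E(x, y, z):
    rho = math.sqrt(x * x + y * y + z * z)
    return (x / rho**3, y / rho**3, z / rho**3)

def work(path, velocity, n=20000):
    """Midpoint-rule approximation of the line integral of E over 0 <= t <= 1."""
    total = 0.0
    for i in range(n):
        t = (i + 0.5) / n
        total += sum(a * b for a, b in zip(E(*path(t)), velocity(t))) / n
    return total

# Path 1: straight segment from (1, 0, 0) to (3, 0, 4).
w_straight = work(lambda t: (1 + 2 * t, 0.0, 4 * t),
                  lambda t: (2.0, 0.0, 4.0))

# Path 2: one turn of a widening spiral with the same endpoints.
def spiral(t):
    r, a = 1 + 2 * t, 2 * math.pi * t
    return (r * math.cos(a), r * math.sin(a), 4 * t**2)

def spiral_velocity(t):
    r, a = 1 + 2 * t, 2 * math.pi * t
    return (2 * math.cos(a) - 2 * math.pi * r * math.sin(a),
            2 * math.sin(a) + 2 * math.pi * r * math.cos(a),
            8 * t)

w_spiral = work(spiral, spiral_velocity)
expected = 1 / 1.0 - 1 / 5.0  # 1/|x0| - 1/|x1|
```

Neither path passes through the origin, where $\mathbf{E}$ is undefined; both results should match `expected` to within the quadrature error.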
Physical conservation laws are usually derived by identifying an appropriate
vector field with the gradient of a function and then evaluating a line
integral.
EXERCISES
1. Compute $\displaystyle\int_{(0,1)}^{(0,-1)} (xy,\; 1 + 2y)\cdot d\mathbf{x}$
(a) along the straight path,
(b) along the semicircular path passing through $(1, 0)$.
2. Let $\mathbf{F} = (3x^2y^2z,\; 2x^3yz,\; x^3y^2)$. Show that $\displaystyle\int_{(0,0,0)}^{(1,1,1)} \mathbf{F}\cdot d\mathbf{x}$ is independent of the path, and
evaluate it.
3. Let $\mathbf{F} = (x^2 + yz,\; y^2 + zx,\; z^2 + xy)$. Show that $\displaystyle\int_{(0,0,0)}^{(a,b,c)} \mathbf{F}\cdot d\mathbf{x}$ is independent of the
path, and evaluate it.
v x \
2 , 2, 2 , 2) .
x y x y j
5- Find
r
r2/ dx + x dy .
^ over the circle x = a.
' + y2
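6. Let $\mathbf{F} = \mathbf{x}/|\mathbf{x}|^5$ and suppose $\mathbf{a} \neq \mathbf{0}$. Show that $\int \mathbf{F}\cdot d\mathbf{x}$, taken along any path
from $\mathbf{a}$ which does not pass through $\mathbf{0}$ and goes out indefinitely, depends only on
$a = |\mathbf{a}|$. Evaluate the integral.
7. (cont.) Do the same for $\mathbf{F} = \mathbf{x}/|\mathbf{x}|^n$, for any $n > 2$.
8. (cont.) Show that $\mathbf{F} = \mathbf{x}/|\mathbf{x}|^2$ is a gradient.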
7. IMPLICIT FUNCTIONS
Often a function $y = g(x)$ is defined only as the root of an equation
$$f(x, y) = 0,$$
which may be hard or impossible to solve explicitly. In such a case, the
equation is said to define an implicit function $y = g(x)$. For example, Fig.
7.1 shows part of the graph of
$$y^6 + y + xy - x = 0.$$
Near the origin, this equation defines $y$ as an implicit function of $x$. (It is
hopeless to express $y$ as an explicit function of $x$.)
Fig. 7.1
What is the derivative of an implicit function $y = g(x)$ defined by
$f(x, y) = 0$? Substitute $y = g(x)$:
$$f[x, g(x)] = 0.$$
Differentiate with respect to $x$ using the Chain Rule:
$$f_x + f_y\,g' = 0, \qquad g' = -\frac{f_x}{f_y}.$$
If $y$ is an implicit function of $x$ defined by
$$f(x, y) = 0,$$
then
$$\frac{dy}{dx} = -\frac{f_x(x, y)}{f_y(x, y)}$$
at each point $(x, y)$ where $f(x, y) = 0$ and $f_y(x, y) \neq 0$.
Differentiation of implicit functions is called implicit differentiation.
Remark: The minus sign in this formula may appear puzzling. It seems
to contradict the natural procedure of canceling differentials. For instance, if
$g = g[y(x)]$, the Chain Rule says $dg/dx = (dg/dy)(dy/dx)$. Hence
$$\frac{dy}{dx} = \frac{dg/dx}{dg/dy},$$
so $dg$ appears to cancel out.
The reason for the minus sign is this. When we write $f(x, y) = 0$, we have
taken $y$ to the other side. The equation $y = g(x)$ is equivalent to $f(x, y) =
y - g(x) = 0$. Now canceling differentials fails because
$$\frac{f_x}{f_y} = \frac{-g'(x)}{1} = -g'.$$
There is the minus sign!
EXAMPLE 7.1
Find
$$\frac{dy}{dx}\,\Big|_{(0,0)},$$
where $y = g(x)$ is defined by $y^6 + y + xy - x = 0$.
Solution:
$$f(x, y) = y^6 + y + xy - x,$$
$$f_x = y - 1, \qquad f_y = 6y^5 + 1 + x.$$
Hence
$$\frac{dy}{dx} = -\frac{f_x}{f_y} = -\frac{y - 1}{6y^5 + 1 + x}.$$
At $(0, 0)$,
$$\frac{dy}{dx} = -\frac{-1}{1} = 1.$$
Alternate Solution: Differentiate the equation
$$y^6 + y + xy - x = 0,$$
treating $y$ as a function of $x$:
$$6y^5\frac{dy}{dx} + \frac{dy}{dx} + x\frac{dy}{dx} + y - 1 = 0,$$
$$\frac{dy}{dx} = -\frac{y - 1}{6y^5 + 1 + x}.$$
Answer: $\dfrac{dy}{dx}\,\Big|_{(0,0)} = 1$.
Remark: The technique in the alternate solution is equivalent to use
of the rule
$$\frac{dy}{dx} = -\frac{f_x}{f_y}$$
because the rule was derived by that very technique.
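Although $y$ cannot be written explicitly in terms of $x$ here, it can be computed numerically, which gives an independent check of the slope found in Example 7.1. The sketch below solves $y^6 + y + xy - x = 0$ for $y$ by Newton's method near the origin and compares a central-difference slope with $-f_x/f_y$. The function names, the starting guess, and the step size are illustrative assumptions.

```python
def f(x, y):
    return y**6 + y + x * y - x

def solve_y(x, y=0.0, steps=50):
    """Newton's method in y for f(x, y) = 0 on the branch through (0, 0)."""
    for _ in range(steps):
        f_y = 6 * y**5 + 1 + x
        y -= f(x, y) / f_y
    return y

def slope(x, y):
    """Implicit differentiation: dy/dx = -f_x / f_y."""
    return -(y - 1) / (6 * y**5 + 1 + x)

h = 1e-5
numeric_slope = (solve_y(h) - solve_y(-h)) / (2 * h)  # difference quotient at x = 0
formula_slope = slope(0.0, 0.0)
```

Both `numeric_slope` and `formula_slope` should be close to 1, the value obtained in the example.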
EXAMPLE 7.2
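Let $y = \sqrt{1 - x^2}$. Compute $y'$ and $y''$ by differentiating implicitly
$$x^2 + y^2 - 1 = 0.$$
Solution: Differentiate:
$$2x + 2yy' = 0, \qquad y' = -\frac{x}{y}.$$
Differentiate again:
$$y'' = -\frac{y - xy'}{y^2} = -\frac{y - x\left(-\dfrac{x}{y}\right)}{y^2} = -\frac{y^2 + x^2}{y^3} = -\frac{1}{y^3},$$
since $x^2 + y^2 = 1$.
Answer:
$$y' = -\frac{x}{y} = -\frac{x}{\sqrt{1 - x^2}}, \qquad y'' = -\frac{1}{y^3} = -\frac{1}{(1 - x^2)^{3/2}}.$$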
Applications
Implicit differentiation is useful when a function must be maximized or
minimized subject to certain restrictions.
EXAMPLE 7.3
A cylindrical container (right circular) is required to have a given
volume $V$. The material on the top and bottom is $k$ times as expensive
as the material on the sides. What are the proportions of
the most economical container?
Solution: The cost $C$ of the container is proportional to
(area of side) + $k$(area of top + area of bottom).
Let $r$ and $h$ denote the radius and height of the container. In the proper units,
$$C = 2\pi rh + k(2\pi r^2), \qquad V = \pi r^2 h.$$
We must minimize $C$ subject to this restriction.
One approach is obvious: solve the last equation for $h$ and substitute
into the equation for $C$. Then $C$ is an explicit function of $r$ which can be
minimized.
It is simpler, however, not to make the substitution, but to consider $C$
as a function of $r$ anyway (as if the substitution had been made). Differentiate
implicitly:
$$\frac{dC}{dr} = 2\pi\left(r\frac{dh}{dr} + h + 2kr\right).$$
Now differentiate the equation for $V$ with respect to $r$:
$$0 = \pi\left(2rh + r^2\frac{dh}{dr}\right), \qquad \frac{dh}{dr} = -\frac{2h}{r}.$$
Substitute this value of $dh/dr$ into the preceding equation:
$$\frac{dC}{dr} = 2\pi\left[r\left(\frac{-2h}{r}\right) + h + 2kr\right] = 2\pi(2kr - h).$$
Hence
$$\frac{dC}{dr} = 0 \quad \text{when} \quad h = 2kr.$$
It is easily verified that $C$ is minimal for $h = 2kr$. Since $\pi r^2 h$ is constant,
$h$ is large if $r$ is small and decreases as $r$ increases. Therefore $(2kr - h)$
increases from negative to positive as $r$ increases. Thus $dC/dr$ satisfies the
conditions for $C$ to have a minimum at $h = 2kr$.
Answer: height = $2k \times$ radius.
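The proportion $h = 2kr$ can be confirmed by the "obvious" approach mentioned in the solution: substitute $h = V/(\pi r^2)$ and minimize the resulting function of $r$ directly. The following Python sketch does this with a ternary search. The values of $V$ and $k$, the search interval, and the function names are assumptions chosen only for illustration.

```python
import math

def cost(r, V, k):
    h = V / (math.pi * r**2)  # height forced by the volume constraint
    return 2 * math.pi * r * h + k * (2 * math.pi * r**2)

def best_radius(V, k, lo=1e-3, hi=100.0, iters=200):
    """Ternary search for the minimizing r; C(r) = 2V/r + 2*pi*k*r^2 is convex for r > 0."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if cost(m1, V, k) < cost(m2, V, k):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2

V, k = 10.0, 1.5  # illustrative values
r = best_radius(V, k)
h = V / (math.pi * r**2)
ratio = h / r  # should be close to 2k
```

For these sample values `ratio` should come out close to $2k = 3$, in agreement with the implicit-differentiation result.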
Remark: The special case $k = 1$ is interesting. All parts of the cylinder
are equally expensive; the cheapest cylinder is the one with least surface area.
Conclusion: Of all cylinders with fixed volume, the one with least surface
area is the one whose height is twice its radius.
EXAMPLE 7.4
Find the greatest distance between the origin and a point of the
curve $x^4 + y^4 = 1$.
Solution: Draw a graph (Fig. 7.2). Because of symmetry we need
consider only $x \geq 0$ and $y \geq 0$. Since the curve lies outside of the circle
$x^2 + y^2 = 1$, the maximum distance is greater than 1 and occurs at some point
$(x, y)$ where $x > 0$ and $y > 0$.
The square of the distance from any point $(x, y)$ to the origin is $x^2 + y^2$.
Hence, we must maximize
$$L^2 = x^2 + y^2$$
subject to
$$x^4 + y^4 - 1 = 0.$$
Differentiate both relations with respect to $x$:
$$\frac{d}{dx}(L^2) = 2x + 2yy', \qquad 4x^3 + 4y^3y' = 0.$$
It follows that
$$y' = -\frac{x^3}{y^3}, \qquad \frac{d}{dx}(L^2) = 2x - \frac{2x^3}{y^2} = \frac{2x}{y^2}(y^2 - x^2).$$
This derivative vanishes in the first quadrant only for $x = y$. Hence the
maximum distance occurs at the point $(x, y)$ of the curve in the first quadrant
for which $x = y$. Thus
$$x^4 + y^4 = 1, \qquad x = y,$$
from which
$$x = y = \frac{1}{\sqrt[4]{2}}$$
and
$$L^2 = x^2 + y^2 = \frac{2}{\sqrt{2}} = \sqrt{2}.$$
Answer: $\sqrt[4]{2}$.
EXERCISES
Find dy/ dx:
1. x + y = x sin y 2. x2 + ys = xy
3. exy 3xy2 4. x4 y4 = 3x2y*
5. ex sin y = ey cos x 6. x*y* = x2 y2 + 1
7. x4 + Sye = 1 8. x5 + yb = xy + 1.
9*. Find the maximum and minimum values of $f(x, y, z) = x^4 + y^4 + z^4$ on the
surface of the unit sphere $x^2 + y^2 + z^2 = 1$.
10*. (cont.) Deduce that $\tfrac{1}{3}(x^2 + y^2 + z^2)^2 \leq x^4 + y^4 + z^4 \leq (x^2 + y^2 + z^2)^2$ for any
$(x, y, z)$.
11*. (cont.) Find the corresponding inequalities for $n > 3$ relating $x^n + y^n + z^n$ and
$x^2 + y^2 + z^2$. (Assume $x \geq 0$, $y \geq 0$, $z \geq 0$ if $n$ is odd.)
12*. (cont.) Find the largest $A$ and the smallest $B$ so that
$A(x^4 + y^4 + z^4)^3 \leq (x^6 + y^6 + z^6)^2 \leq B(x^4 + y^4 + z^4)^3$ for all $(x, y, z)$.
13. Suppose $z(x, y)$ is defined implicitly by $F(x, y, z) = 0$. Assuming $F_z(x, y, z) \neq 0$,
prove
$$\frac{\partial z}{\partial x} = -\frac{F_x[x, y, z(x, y)]}{F_z[x, y, z(x, y)]}, \qquad \frac{\partial z}{\partial y} = -\frac{F_y[x, y, z(x, y)]}{F_z[x, y, z(x, y)]}.$$
[Hint: Use the Chain Rule.]
14*. (cont.) Suppose $z(x, y)$ and $w(x, y)$ are defined implicitly by
$$F(x, y, z, w) = 0, \qquad G(x, y, z, w) = 0.$$
Find formulas for $\partial z/\partial x$, $\partial z/\partial y$, $\partial w/\partial x$, $\partial w/\partial y$ under suitable hypotheses.
8. DIFFERENTIALS
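We shall re-examine our basic approximation formula
$$f(\mathbf{c} + \mathbf{x}) = f(\mathbf{c}) + \mathbf{k}\cdot\mathbf{x} + e(\mathbf{x})$$
with error $e(\mathbf{x})$. We assume that the domain $D$ of $f$ is an open set and that $f$ is
differentiable at every point $\mathbf{c}$ of $D$. Hence
$$\mathbf{k} = \operatorname{grad} f\,\Big|_{\mathbf{c}} = \left(\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z}\right)\Big|_{\mathbf{c}}.$$
Suppose we allow both $\mathbf{c}$ and $\mathbf{x}$ to vary. It is customary to replace $\mathbf{c}$ by $\mathbf{x}$
and $\mathbf{x}$ by $d\mathbf{x} = (dx, dy, dz)$, a new quantity. In this notation,
$$f(\mathbf{x} + d\mathbf{x}) = f(\mathbf{x}) + \mathbf{k}\cdot d\mathbf{x} + e(d\mathbf{x}) = f(\mathbf{x}) + \boxed{\frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy + \frac{\partial f}{\partial z}\,dz} + e(d\mathbf{x}).$$
The boxed quantity is particularly significant. For each $\mathbf{x}$ it is a linear function
of $d\mathbf{x}$. We therefore give it a name, $df$, the differential of $f$, and write
$$df = \frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy + \frac{\partial f}{\partial z}\,dz.$$
The differential is really a function of the six variables $x, y, z, dx, dy, dz$,
where $\mathbf{x}$ is confined to $D$ and $d\mathbf{x}$ is arbitrary. We could even write $df = df(\mathbf{x}, d\mathbf{x})$
but this is a cumbersome notation.
The differential has elementary algebraic properties that correspond to
analogous results for derivatives:
$$d(f + g) = df + dg, \qquad d(af) = a\,df,$$
$$d(fg) = (df)g + f\,dg, \qquad d(f/g) = \frac{g\,df - f\,dg}{g^2}, \quad g \neq 0.$$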
Substitution
The differential has a useful formal property of remaining unchanged when
the variables are replaced by functions of other variables. This is another
form of the Chain Rule.
For instance, suppose $f = f(x, y, z)$ so
$$df = f_x\,dx + f_y\,dy + f_z\,dz.$$
Now suppose $x = x(u, v)$, $y = y(u, v)$, $z = z(u, v)$. We can look at the
composite function $f[\mathbf{x}(u, v)]$ and compute its differential, another $df$:
$$df = f_u\,du + f_v\,dv.$$
Is this different? No, because by the Chain Rule
$$f_u\,du + f_v\,dv = (f_x x_u + f_y y_u + f_z z_u)\,du + (f_x x_v + f_y y_v + f_z z_v)\,dv$$
$$= f_x(x_u\,du + x_v\,dv) + f_y(y_u\,du + y_v\,dv) + f_z(z_u\,du + z_v\,dv)$$
$$= f_x\,dx + f_y\,dy + f_z\,dz.$$
This may appear as simply a consequence of sloppy notation for composite
functions. Still there is something more to it.
Suppose we have a function $f$ of independent variables $u$, $v$, but there are
some intermediate variables in the way. After some computation we arrive at
an expression
(1) $df = M\,du + N\,dv$.
Then we know automatically that $M = f_u$ and $N = f_v$, no matter how we
obtained (1).
For example, suppose $z = z(x, y)$, and we know that
$$x^2 + y^2 + z^2 = 1.$$
Then, since the differential of a constant function is 0,
$$x\,dx + y\,dy + z\,dz = 0,$$
hence
$$dz = -\frac{x}{z}\,dx - \frac{y}{z}\,dy.$$
We conclude that $\partial z/\partial x = -x/z$ and $\partial z/\partial y = -y/z$.
More generally, if $z = z(x, y)$ is constrained by a relation
$$F(x, y, z) = 0,$$
then
$$F_x\,dx + F_y\,dy + F_z\,dz = 0,$$
which implies
$$dz = -\frac{F_x}{F_z}\,dx - \frac{F_y}{F_z}\,dy$$
whenever $F_z \neq 0$ at $(x, y, z(x, y))$. Therefore
$$\frac{\partial z}{\partial x} = -\frac{F_x}{F_z}, \qquad \frac{\partial z}{\partial y} = -\frac{F_y}{F_z}.$$
Here is a situation where an intermediate variable is in the way. Suppose
$z = z(x, y)$ is subject to two constraints,
$$F(x, y, z, u) = 0,$$
$$G(x, y, u) = 0.$$
If we could solve the second relation for $u$ and substitute the solution in the
first, the result would be a relation $H(x, y, z) = 0$. But that might be
impractical to carry out, so we proceed indirectly by forming differentials:
$$F_x\,dx + F_y\,dy + F_z\,dz + F_u\,du = 0,$$
$$G_x\,dx + G_y\,dy + G_u\,du = 0.$$
Now we eliminate $du$:
$$(G_uF_x - F_uG_x)\,dx + (G_uF_y - F_uG_y)\,dy + G_uF_z\,dz = 0.$$
At points where $G_u \neq 0$ and $F_z \neq 0$, we have
$$\frac{\partial z}{\partial x} = -\frac{G_uF_x - F_uG_x}{G_uF_z}, \qquad \frac{\partial z}{\partial y} = -\frac{G_uF_y - F_uG_y}{G_uF_z}.$$
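The elimination formulas for two constraints can be tested on a case where $u$ really can be eliminated by hand. In the sketch below the constraints $F = z - xu^2 = 0$ and $G = u - x - y = 0$ are an assumed example (so that $z = x(x + y)^2$ explicitly); the function `dz_formulas` and the sample point are likewise illustrative.

```python
def dz_formulas(Fx, Fy, Fz, Fu, Gx, Gy, Gu):
    """Partials of z from F(x, y, z, u) = 0 and G(x, y, u) = 0 (needs Gu != 0, Fz != 0)."""
    zx = -(Gu * Fx - Fu * Gx) / (Gu * Fz)
    zy = -(Gu * Fy - Fu * Gy) / (Gu * Fz)
    return zx, zy

# Assumed example: F = z - x*u**2, G = u - x - y, hence z = x*(x + y)**2.
x, y = 1.5, -0.25
u = x + y
zx, zy = dz_formulas(Fx=-u**2, Fy=0.0, Fz=1.0, Fu=-2 * x * u,
                     Gx=-1.0, Gy=-1.0, Gu=1.0)

# Direct differentiation of z = x*(x + y)**2 for comparison.
zx_direct = (x + y)**2 + 2 * x * (x + y)
zy_direct = 2 * x * (x + y)
```

The pairs (`zx`, `zx_direct`) and (`zy`, `zy_direct`) should agree up to rounding, which is what the differential argument predicts.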
Numerical Approximations
The approximation
$$f(\mathbf{x} + d\mathbf{x}) \approx f(\mathbf{x}) + df$$
can supply quick numerical estimates.
EXAMPLE 8.1
Estimate $(2.01)^{0.98}$.
Solution: Set $f(x, y) = x^y$. Then
$$\ln f = y\ln x,$$
$$\frac{df}{f} = \frac{y}{x}\,dx + (\ln x)\,dy.$$
Set $x = 2$, $y = 1$, $dx = 0.01$, $dy = -0.02$. Also use the estimate $\ln 2 \approx 0.69$.
Then $f(2, 1) = 2$ and
$$\frac{df}{f} \approx \frac{1}{2}(0.01) - (0.69)(0.02) = -0.0088,$$
$$df \approx -0.0176, \qquad f \approx 2 + df \approx 1.9824.$$
Answer: 1.9824.
Remark: By 4-place logs, $(2.01)^{0.98} \approx 1.982$.
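The estimate in Example 8.1 is easy to compare with a direct floating-point evaluation. The short Python sketch below applies the differential of $f = x^y$ (with `math.log(2)` in place of the rounded value 0.69, so the last digits differ slightly from the hand computation). The function name `estimate_power` is an illustrative choice.

```python
import math

def estimate_power(x, y, dx, dy):
    """First-order estimate of (x + dx)**(y + dy) using df = f*((y/x) dx + ln(x) dy)."""
    f = x**y
    df = f * ((y / x) * dx + math.log(x) * dy)
    return f + df

approx = estimate_power(2.0, 1.0, 0.01, -0.02)
exact = 2.01**0.98
error = abs(approx - exact)
```

The difference `error` is of second order in $dx$ and $dy$, which is why the estimate agrees with the tabulated value to about three decimal places.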
EXERCISES
Compute $dz$:
3. $x = z^2 - y^2$
4. $ze^{-xy} + xe^{-yz} = 0$.
5. For $x = r\cos\theta$, $y = r\sin\theta$, prove $r\,dr = x\,dx + y\,dy$ and $x\,dy - y\,dx = r^2\,d\theta$.
6. Let $z = z(x, y)$, $p = \partial z/\partial x$, $q = \partial z/\partial y$, $r = \partial^2 z/\partial x^2$, $s = \partial^2 z/\partial x\,\partial y$, $t = \partial^2 z/\partial y^2$.
Express $dz$, $dp$, $dq$ in terms of $dx$, $dy$.
7. Let $f$ be homogeneous of degree $n$, that is, $f(tx, ty, tz) = t^n f(x, y, z)$ for $t > 0$.
Compute $d$ of both sides of this relation and equate coefficients of $dx$, $dy$, $dz$, $dt$.
What results? (Compare Ex. 24, p. 291.)
8. Let $z = z(x, y)$ be defined by
$$x^2 + y^2 + z^2 = t^2, \qquad y = tx.$$
Find $z_x$ and $z_y$.
If $\mathbf{v} = \mathbf{v}(s, t) = (f(s, t), g(s, t), h(s, t))$ is a vector function of several variables, then
we define $d\mathbf{v}$ by $d\mathbf{v} = (df, dg, dh) = \mathbf{v}_s\,ds + \mathbf{v}_t\,dt$.
9. Consider the vectors $\mathbf{x} = (x, y)$, $\mathbf{u} = (\cos\theta, \sin\theta)$, $\mathbf{w} = (-\sin\theta, \cos\theta)$ of
Chapter 5, Section 7, p. 193. Prove that
$$d\mathbf{u} = \mathbf{w}\,d\theta, \qquad d\mathbf{w} = -\mathbf{u}\,d\theta,$$
$$d\mathbf{x} = \mathbf{u}\,dr + r\mathbf{w}\,d\theta.$$
Estimate using differentials:
10. $5.1 \times 7.1 \times 9.9$
11. $\sqrt{(5.99)^2 - (3.02)^3}$.
9. PROOF OF THE CHAIN RULE [optional]
Vector-valued Functions
In Chapter 7 we discussed linear functions $L: \mathbf{R}^3 \to \mathbf{R}^3$. In column
vector notation, each linear $L$ has the form
$$L(\mathbf{x}') = A\mathbf{x}',$$
where $A$ is a $3 \times 3$ matrix. It will be convenient throughout this section to
think of $\mathbf{R}^3$ as the space of column vectors, even when we deal with more general
functions, not necessarily linear.
We shall consider functions $F: D \to \mathbf{R}^3$, where $D \subset \mathbf{R}^3$. Such a function
assigns to each $\mathbf{x}'$ in the domain $D$ a column vector $F(\mathbf{x}')$ in $\mathbf{R}^3$. We may write
$$F(\mathbf{x}') = \begin{pmatrix} f_1(\mathbf{x}') \\ f_2(\mathbf{x}') \\ f_3(\mathbf{x}') \end{pmatrix}.$$
Thus the vector-valued function $F$ is equivalent to a triple $f_1, f_2, f_3$ of
real-valued functions, the components of $F$.
Our first chore is to say when a vector-valued function is differentiable.
Differentiable Vector-valued Function  Let $F: D \to \mathbf{R}^3$, where $D \subset \mathbf{R}^3$.
Then $F$ is differentiable at a point $\mathbf{c}'$ of $D$ if there are a matrix $A$ and a
vector-valued function $E$ such that
(i) $F(\mathbf{c}' + \mathbf{x}') = F(\mathbf{c}') + A\mathbf{x}' + E(\mathbf{x}')$ whenever $\mathbf{c}' + \mathbf{x}' \in D$,
(ii) $\dfrac{1}{|\mathbf{x}'|}E(\mathbf{x}') \to \mathbf{0}$ as $\mathbf{x}' \to \mathbf{0}$.
It is understood in (ii) that $\mathbf{c}' + \mathbf{x}' \in D$ when $\mathbf{x}' \to \mathbf{0}$.
We want to identify the matrix $A$ in (i). If we compare this situation with
that of a differentiable $f: \mathbf{R}^3 \to \mathbf{R}$, as discussed in Section 1, we suspect
that $A$ must involve partial derivatives. That this is so will fall out of the
following characterization of differentiability.
Theorem  Let $F: D \to \mathbf{R}^3$ and let $f_1, f_2, f_3$ be the components of $F$, so
$f_i: D \to \mathbf{R}$. Then $F$ is differentiable at a point $\mathbf{c}'$ if and only if $f_1$, $f_2$,
and $f_3$ are differentiable at $\mathbf{c}'$.
Proof: Suppose $F$ is differentiable. Write
(1) $$F(\mathbf{x}') = \begin{pmatrix} f_1(\mathbf{x}') \\ f_2(\mathbf{x}') \\ f_3(\mathbf{x}') \end{pmatrix}, \qquad A = \begin{pmatrix} \mathbf{a}_1 \\ \mathbf{a}_2 \\ \mathbf{a}_3 \end{pmatrix}, \qquad E(\mathbf{x}') = \begin{pmatrix} e_1(\mathbf{x}') \\ e_2(\mathbf{x}') \\ e_3(\mathbf{x}') \end{pmatrix},$$
where $\mathbf{a}_1, \mathbf{a}_2, \mathbf{a}_3$ are the rows of $A$. From
(2) $F(\mathbf{c}' + \mathbf{x}') = F(\mathbf{c}') + A\mathbf{x}' + E(\mathbf{x}')$
we have
(3) $f_i(\mathbf{c}' + \mathbf{x}') = f_i(\mathbf{c}') + \mathbf{a}_i\mathbf{x}' + e_i(\mathbf{x}')$.
The vector function $|\mathbf{x}'|^{-1}E(\mathbf{x}') \to \mathbf{0}$ as $\mathbf{x}' \to \mathbf{0}$, hence each of its
components approaches 0, that is, $|\mathbf{x}'|^{-1}e_i(\mathbf{x}') \to 0$. Hence by (3), each $f_i$
is differentiable.
Conversely, if each $f_i$ is differentiable at $\mathbf{c}'$, then (3) holds and
$|\mathbf{x}'|^{-1}e_i(\mathbf{x}') \to 0$ for $i = 1, 2, 3$. Consequently (2) holds with $A$ and
$E(\mathbf{x}')$ as in (1); furthermore $|\mathbf{x}'|^{-1}E(\mathbf{x}') \to \mathbf{0}$ as $\mathbf{x}' \to \mathbf{0}$ because each
component of $|\mathbf{x}'|^{-1}E(\mathbf{x}')$ approaches 0. Therefore $F$ is differentiable.
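Theorem  Let $F$ be differentiable at an interior point $\mathbf{c}'$ of its domain and
let
$$F(\mathbf{c}' + \mathbf{x}') = F(\mathbf{c}') + A\mathbf{x}' + E(\mathbf{x}'),$$
where $|\mathbf{x}'|^{-1}E(\mathbf{x}') \to \mathbf{0}$ as $\mathbf{x}' \to \mathbf{0}$. Then
$$A = \begin{pmatrix}
\dfrac{\partial f_1}{\partial x_1} & \dfrac{\partial f_1}{\partial x_2} & \dfrac{\partial f_1}{\partial x_3} \\[2ex]
\dfrac{\partial f_2}{\partial x_1} & \dfrac{\partial f_2}{\partial x_2} & \dfrac{\partial f_2}{\partial x_3} \\[2ex]
\dfrac{\partial f_3}{\partial x_1} & \dfrac{\partial f_3}{\partial x_2} & \dfrac{\partial f_3}{\partial x_3}
\end{pmatrix},$$
where $f_1, f_2, f_3$ are the components of $F$ and all partials are evaluated at $\mathbf{c}'$.
This is an immediate consequence of the corresponding result for the
functions $f: \mathbf{R}^3 \to \mathbf{R}$ and the previous theorem.
The matrix $A$ written above is called the Jacobian matrix of $F$. It is
important in the study of transformations of regions of space and in change of
variables in multiple integrals.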
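A Jacobian matrix can be approximated numerically by differencing each component in each coordinate direction, which is a convenient way to check hand-computed partials. The sketch below is one minimal way to do this in Python; the function `jacobian`, the sample map `F`, the base point, and the step size are assumptions made for illustration.

```python
def jacobian(F, c, h=1e-6):
    """Central-difference approximation of the matrix [d f_i / d x_j] at the point c."""
    n = len(c)
    columns = []
    for j in range(n):
        plus, minus = list(c), list(c)
        plus[j] += h
        minus[j] -= h
        Fp, Fm = F(plus), F(minus)
        # Column j holds the partials of every component with respect to x_j.
        columns.append([(p - m) / (2 * h) for p, m in zip(Fp, Fm)])
    m = len(columns[0])
    return [[columns[j][i] for j in range(n)] for i in range(m)]

def F(x):
    # Assumed sample map R^3 -> R^3.
    return [x[0] * x[1], x[1] + x[2]**2, x[0] * x[2]]

A = jacobian(F, [1.0, 2.0, 3.0])

# Partials computed by hand at (1, 2, 3): rows are grad f_1, grad f_2, grad f_3.
A_exact = [[2.0, 1.0, 0.0],
           [0.0, 1.0, 6.0],
           [3.0, 0.0, 1.0]]
```

Row $i$ of `A` approximates the gradient of the component $f_i$, matching the layout of the matrix in the theorem.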
Chain Rule
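We first consider a composite function $h = f \circ G$, where $G$ is a differentiable
function on a domain $S$ of $\mathbf{u}$-space into $\mathbf{x}$-space and $f$ is a differentiable function
on a domain $D$ of $\mathbf{x}$-space into $\mathbf{R}$. We assume $G$ sends $S$ into $D$. We want to
prove that $h$ is differentiable and to compute its partials. See Fig. 9.1.
Fig. 9.1 composite vector function
Let us experiment with the special case in which both $f$ and $G$ are
non-homogeneous linear:
$$f(\mathbf{x}') = d + \mathbf{k}\mathbf{x}', \qquad G(\mathbf{u}') = \mathbf{b}' + A\mathbf{u}'.$$
Then
$$h(\mathbf{u}') = f[G(\mathbf{u}')] = d + \mathbf{k}\mathbf{b}' + \mathbf{k}A\mathbf{u}'.$$
For any $\mathbf{c}'$,
$$h(\mathbf{c}' + \mathbf{u}') = d + \mathbf{k}\mathbf{b}' + \mathbf{k}A(\mathbf{c}' + \mathbf{u}') = h(\mathbf{c}') + \mathbf{k}A\mathbf{u}'.$$
Therefore $h$ is also non-homogeneous linear and
$$\left(\frac{\partial h}{\partial u_1}, \frac{\partial h}{\partial u_2}, \frac{\partial h}{\partial u_3}\right) = \mathbf{k}A = \left(\frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \frac{\partial f}{\partial x_3}\right)\left[\frac{\partial g_i}{\partial u_j}\right].$$
With this easy case as a guide, we are ready to formulate the general result.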
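Chain Rule  Let $f: D \to \mathbf{R}$ be defined on a domain $D$ in $\mathbf{x}$-space,
differentiable at a point $\mathbf{a}'$. Let $G: S \to \mathbf{R}^3$ be defined on a domain $S$
of $\mathbf{u}$-space into $\mathbf{x}$-space, differentiable at a point $\mathbf{c}'$ such that $G(\mathbf{c}') =
\mathbf{a}'$. Assume $G(\mathbf{u}') \in D$ whenever $\mathbf{u}' \in S$.
Then the composite function $h = f \circ G$ is differentiable at $\mathbf{c}'$.
If
$$f(\mathbf{a}' + \mathbf{x}') \approx f(\mathbf{a}') + \mathbf{k}\mathbf{x}' \qquad \text{and} \qquad G(\mathbf{c}' + \mathbf{u}') \approx G(\mathbf{c}') + A\mathbf{u}',$$
then
$$h(\mathbf{c}' + \mathbf{u}') \approx h(\mathbf{c}') + \mathbf{k}A\mathbf{u}'.$$
Finally, if $\mathbf{a}'$ and $\mathbf{c}'$ are interior points of $D$ and $S$ respectively, then
$$\frac{\partial h(\mathbf{c}')}{\partial u_j} = \sum_{i=1}^{3} \frac{\partial f(\mathbf{a}')}{\partial x_i}\,\frac{\partial g_i(\mathbf{c}')}{\partial u_j}$$
for $j = 1, 2, 3$, where the $g_i$ are the components of $G$.
Proof: By hypothesis,
(1) $f(\mathbf{a}' + \mathbf{x}') = f(\mathbf{a}') + \mathbf{k}\mathbf{x}' + e(\mathbf{x}')$
(2) $G(\mathbf{c}' + \mathbf{u}') = G(\mathbf{c}') + A\mathbf{u}' + E(\mathbf{u}')$,
where
(3) $\dfrac{e(\mathbf{x}')}{|\mathbf{x}'|} \to 0$ as $\mathbf{x}' \to \mathbf{0}$
(4) $\dfrac{E(\mathbf{u}')}{|\mathbf{u}'|} \to \mathbf{0}$ as $\mathbf{u}' \to \mathbf{0}$.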
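Substitute:
$$h(\mathbf{c}' + \mathbf{u}') = f[G(\mathbf{c}' + \mathbf{u}')]$$
$$= f[G(\mathbf{c}') + A\mathbf{u}' + E(\mathbf{u}')]$$
$$= f[G(\mathbf{c}')] + \mathbf{k}[A\mathbf{u}' + E(\mathbf{u}')] + e[A\mathbf{u}' + E(\mathbf{u}')],$$
hence
(5) $h(\mathbf{c}' + \mathbf{u}') = h(\mathbf{c}') + \mathbf{k}A\mathbf{u}' + \{\mathbf{k}E(\mathbf{u}') + e[A\mathbf{u}' + E(\mathbf{u}')]\}$.
We must prove that the term in braces goes to zero faster than $\mathbf{u}'$, that is,
$$\frac{\mathbf{k}E(\mathbf{u}') + e[A\mathbf{u}' + E(\mathbf{u}')]}{|\mathbf{u}'|} \to 0 \quad \text{as} \quad \mathbf{u}' \to \mathbf{0}.$$
For then (5) says that $h$ is differentiable at $\mathbf{c}'$ and that $h(\mathbf{c}' + \mathbf{u}') \approx h(\mathbf{c}') +
\mathbf{k}A\mathbf{u}'$.
Clearly
$$\frac{\mathbf{k}E(\mathbf{u}')}{|\mathbf{u}'|} = \mathbf{k}\,\frac{E(\mathbf{u}')}{|\mathbf{u}'|} \to 0$$
by (4). It remains to prove that
$$\frac{e[A\mathbf{u}' + E(\mathbf{u}')]}{|\mathbf{u}'|} \to 0 \quad \text{as} \quad \mathbf{u}' \to \mathbf{0}.$$
This is the crux of the proof, and it requires some care and patience.
First, by the triangle inequality
$$|A\mathbf{u}' + E(\mathbf{u}')| \leq |A\mathbf{u}'| + |E(\mathbf{u}')|.$$
We proved in Chapter 7, Section 4, that $|A\mathbf{u}'| \leq b_1|\mathbf{u}'|$ where $b_1$ is a constant.
We also have $|E(\mathbf{u}')| < |\mathbf{u}'|$ for $|\mathbf{u}'|$ sufficiently small since $|\mathbf{u}'|^{-1}E(\mathbf{u}') \to \mathbf{0}$
by (4). Therefore
$$|A\mathbf{u}' + E(\mathbf{u}')| \leq b\,|\mathbf{u}'|,$$
where $b = b_1 + 1$ is a constant.
Now write $e(\mathbf{x}') = |\mathbf{x}'|\,e_1(\mathbf{x}')$, where by (3) we have $e_1(\mathbf{x}') \to 0$ as
$\mathbf{x}' \to \mathbf{0}$. Then
$$|e[A\mathbf{u}' + E(\mathbf{u}')]| = |A\mathbf{u}' + E(\mathbf{u}')|\,|e_1[A\mathbf{u}' + E(\mathbf{u}')]|$$
$$\leq b\,|\mathbf{u}'|\,|e_1[A\mathbf{u}' + E(\mathbf{u}')]|.$$
But $A\mathbf{u}' + E(\mathbf{u}') \to \mathbf{0}$ as $\mathbf{u}' \to \mathbf{0}$, hence
$$\frac{|e[A\mathbf{u}' + E(\mathbf{u}')]|}{|\mathbf{u}'|} \leq b\,|e_1[A\mathbf{u}' + E(\mathbf{u}')]| \to 0$$
as $\mathbf{u}' \to \mathbf{0}$. This completes the proof of the first part of the Chain Rule.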
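If $\mathbf{c}'$ and $\mathbf{a}'$ are interior points, then
$$\mathbf{k} = \left(\frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \frac{\partial f}{\partial x_3}\right)\Big|_{\mathbf{a}'} \qquad \text{and} \qquad A = \begin{pmatrix}
\dfrac{\partial g_1}{\partial u_1} & \dfrac{\partial g_1}{\partial u_2} & \dfrac{\partial g_1}{\partial u_3} \\[2ex]
\dfrac{\partial g_2}{\partial u_1} & \dfrac{\partial g_2}{\partial u_2} & \dfrac{\partial g_2}{\partial u_3} \\[2ex]
\dfrac{\partial g_3}{\partial u_1} & \dfrac{\partial g_3}{\partial u_2} & \dfrac{\partial g_3}{\partial u_3}
\end{pmatrix}\Bigg|_{\mathbf{c}'}.$$
We have proved that
$$h(\mathbf{c}' + \mathbf{u}') = h(\mathbf{c}') + \mathbf{k}A\mathbf{u}' + \varepsilon(\mathbf{u}'),$$
where $|\mathbf{u}'|^{-1}\varepsilon(\mathbf{u}') \to 0$ as $\mathbf{u}' \to \mathbf{0}$. Hence the elements of the row
vector $\mathbf{k}A$ are the partials of $h$ at $\mathbf{c}'$. But the $j$-th element of $\mathbf{k}A$ is precisely
$$\sum_{i=1}^{3} \frac{\partial f}{\partial x_i}\,\frac{\partial g_i}{\partial u_j},$$
as required.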
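The conclusion of the Chain Rule, that the partials of $h = f \circ G$ are the entries of the row vector $\mathbf{k}A$, can be checked on a concrete pair of functions. In the Python sketch below the functions `f` and `G`, their hand-computed derivatives, and the point `c` are all assumptions chosen for illustration; the product $\mathbf{k}A$ is compared with a finite-difference gradient of the composite.

```python
def f(x):
    return x[0] * x[1] + x[2]**2            # assumed f: R^3 -> R

def grad_f(x):
    return [x[1], x[0], 2 * x[2]]           # the row vector k

def G(u):
    return [u[0] * u[1], u[1] + u[2], u[0]**2]  # assumed G: R^3 -> R^3

def jac_G(u):
    # The matrix A = [d g_i / d u_j], computed by hand.
    return [[u[1], u[0], 0.0],
            [0.0, 1.0, 1.0],
            [2 * u[0], 0.0, 0.0]]

def grad_h(u):
    """Row vector kA: dh/du_j = sum_i (df/dx_i)(dg_i/du_j), with k evaluated at G(u)."""
    k, A = grad_f(G(u)), jac_G(u)
    return [sum(k[i] * A[i][j] for i in range(3)) for j in range(3)]

def grad_h_numeric(u, step=1e-6):
    """Central differences applied directly to h(u) = f(G(u))."""
    result = []
    for j in range(3):
        plus, minus = list(u), list(u)
        plus[j] += step
        minus[j] -= step
        result.append((f(G(plus)) - f(G(minus))) / (2 * step))
    return result

c = [1.0, 2.0, -1.0]
by_chain_rule = grad_h(c)        # (6, 3, 2) for this choice of f, G, c
by_differences = grad_h_numeric(c)
```

Note that $\mathbf{k}$ is evaluated at $\mathbf{a}' = G(\mathbf{c}')$ while $A$ is evaluated at $\mathbf{c}'$, exactly as in the statement of the theorem.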
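Composite of Vector-valued Functions
We next consider a more general form of the Chain Rule. It turns out to be
an easy consequence of the previous one.
Theorem  Let $F: D \to \mathbf{R}^3$ and $G: S \to \mathbf{R}^3$, where $G(\mathbf{u}') \in D$ for
each $\mathbf{u}' \in S$. Suppose that $G(\mathbf{c}') = \mathbf{a}'$, that $G$ is differentiable at $\mathbf{c}'$, and
that $F$ is differentiable at $\mathbf{a}'$. Then the composite function $H = F \circ G$ is
differentiable at $\mathbf{c}'$. If
$$F(\mathbf{a}' + \mathbf{x}') \approx F(\mathbf{a}') + A\mathbf{x}' \qquad \text{and} \qquad G(\mathbf{c}' + \mathbf{u}') \approx G(\mathbf{c}') + B\mathbf{u}',$$
then
$$H(\mathbf{c}' + \mathbf{u}') \approx H(\mathbf{c}') + (AB)\mathbf{u}'.$$
If $\mathbf{a}'$ and $\mathbf{c}'$ are interior points, then
$$\frac{\partial h_i(\mathbf{c}')}{\partial u_k} = \sum_{j=1}^{3} \frac{\partial f_i(\mathbf{a}')}{\partial x_j}\,\frac{\partial g_j(\mathbf{c}')}{\partial u_k}.$$
Proof: Write
$$H(\mathbf{u}') = \begin{pmatrix} h_1(\mathbf{u}') \\ h_2(\mathbf{u}') \\ h_3(\mathbf{u}') \end{pmatrix} \qquad \text{and} \qquad F(\mathbf{x}') = \begin{pmatrix} f_1(\mathbf{x}') \\ f_2(\mathbf{x}') \\ f_3(\mathbf{x}') \end{pmatrix}.$$
Then $h_i = f_i \circ G$ because $H = F \circ G$. Since $F$ is differentiable, each $f_i$ is
differentiable. Hence, by the previous theorem, each $h_i$ is differentiable and
$$\frac{\partial h_i}{\partial u_k} = \sum_{j=1}^{3} \frac{\partial f_i}{\partial x_j}\,\frac{\partial g_j}{\partial u_k}$$
in the case of interior points. But this implies $H$ is differentiable since a
vector-valued function is differentiable if and only if its component functions
are differentiable.
Remark: The theorem asserts that the Jacobian matrix of the composite
function $F \circ G$ built from $F$ and $G$ is the matrix product $AB$ of their respective
Jacobian matrices $A$ and $B$.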
Other Chain Rules
The definitions, theorems, and proofs in this section can be modified with
little trouble to cover all useful versions of the Chain Rule. The most general
case might be indicated schematically as follows:
$$\mathbf{R}^m \xrightarrow{\;G\;} \mathbf{R}^n \xrightarrow{\;F\;} \mathbf{R}^p, \qquad H = F \circ G: \mathbf{R}^m \to \mathbf{R}^p.$$
Here $F$ and $G$ are differentiable on suitable domains. Then $H$ is differentiable
and
$$\frac{\partial h_i}{\partial u_k} = \sum_{j=1}^{n} \frac{\partial f_i}{\partial x_j}\,\frac{\partial g_j}{\partial u_k}$$
at interior points. The long proof starting on page 322 covers the case $p = 1$
almost verbatim, and the last proof above extends to arbitrary $p$.
9. Higher Partial Derivatives
1. MIXED PARTIALS
Differentiate a function $f(x, y)$ of two variables. There are two first
derivatives,
$$f_x(x, y) \qquad \text{and} \qquad f_y(x, y),$$
each itself a function of two variables. Each in turn has two first partial
derivatives; these four new functions are the second derivatives of $f(x, y)$.
Figure 1.1 shows their evolution:
[Fig. 1.1: $f$ branches under $\partial/\partial x$ and $\partial/\partial y$ into $f_x$ and $f_y$; each of these branches again under $\partial/\partial x$ and $\partial/\partial y$, giving $f_{xx}$, $f_{xy}$, $f_{yx}$, $f_{yy}$.]
The pure second partials
$$f_{xx} \qquad \text{and} \qquad f_{yy}$$
represent nothing really new. Each is found by holding one variable constant
and differentiating twice with respect to the other variable.
Alternate notation:
$$f_{xx} = \frac{\partial^2 f}{\partial x^2}, \qquad f_{yy} = \frac{\partial^2 f}{\partial y^2}.$$
For example, if $f(x, y) = x^3y^4 + \cos 5y$,
$$f_x = 3x^2y^4, \qquad f_y = 4x^3y^3 - 5\sin 5y,$$
$$f_{xx} = 6xy^4, \qquad f_{yy} = 12x^3y^2 - 25\cos 5y.$$
The mixed second partials
$$f_{xy} = \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial x}\right) = \frac{\partial^2 f}{\partial y\,\partial x} \qquad \text{and} \qquad f_{yx} = \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial y}\right) = \frac{\partial^2 f}{\partial x\,\partial y}$$
are new. The mixed partial $f_{xy}$ measures the rate of change in the $y$-direction
of the rate of change of $f$ in the $x$-direction. The other mixed partial $f_{yx}$
measures the rate of change in the $x$-direction of the rate of change of $f$ in the
$y$-direction. It is not easy to see how, if at all, the two mixed partials are related
to each other.
Let us compute the mixed partials of the function $f(x, y) = x^3y^4 + \cos 5y$:
$$f_x = 3x^2y^4, \qquad f_{xy} = 3\cdot 4x^2y^3.$$
$$f_y = 4x^3y^3 - 5\sin 5y, \qquad f_{yx} = 4\cdot 3x^2y^3 + 0.$$
The mixed partials are equal! This is not an accident but a special case of a
general phenomenon, true for functions normally encountered in applications.
Theorem  Let $f(x, y)$ be defined on an open set $D$. If $f_{xy}$ and $f_{yx}$ exist at
each point of $D$ and are continuous at $(a, b)$, then
$$\frac{\partial^2 f}{\partial x\,\partial y} = \frac{\partial^2 f}{\partial y\,\partial x} \quad \text{at } (a, b).$$
Proof: We consider the mixed second difference
(1) $\Delta = [f(a + h, b + k) - f(a + h, b)] - [f(a, b + k) - f(a, b)]$
and its alternate form
(2) $\Delta = [f(a + h, b + k) - f(a, b + k)] - [f(a + h, b) - f(a, b)]$.
We shall apply the Mean Value Theorem (twice) to (1). To do this, we set
$$g(x) = f(x, b + k) - f(x, b).$$
Then
$$\Delta = g(a + h) - g(a) = h\,g'(x_1),$$
where $x_1$ is between $a$ and $a + h$. Next,
$$g'(x_1) = f_x(x_1, b + k) - f_x(x_1, b) = k\,f_{xy}(x_1, y_1),$$
where $y_1$ is between $b$ and $b + k$. Hence
(3) $\Delta = hk\,f_{xy}(x_1, y_1)$.
We apply similar reasoning to the second expression for $\Delta$ to obtain
(4) $\Delta = hk\,f_{yx}(x_2, y_2)$,
where $x_2$ is between $a$ and $a + h$ and $y_2$ is between $b$ and $b + k$.
Now we take $h = k \neq 0$. By (3) and (4),
(5) $f_{xy}(x_1, y_1) = f_{yx}(x_2, y_2)$.
Let $h \to 0$. Then $(x_1, y_1) \to (a, b)$ and $(x_2, y_2) \to (a, b)$. Since
$f_{xy}$ and $f_{yx}$ are continuous at $(a, b)$, we deduce from (5) that
$$f_{xy}(a, b) = f_{yx}(a, b).$$
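The mixed second difference used in this proof also gives a simple numerical estimate of a mixed partial, since $\Delta/(hk)$ equals $f_{xy}$ at a nearby point. The Python sketch below applies this to $f(x, y) = x^3y^4 + \cos 5y$, whose mixed partial $12x^2y^3$ was computed earlier in the section. The sample point $(a, b) = (1, 2)$, the step $h$, and the function names are assumptions made for the example.

```python
import math

def f(x, y):
    return x**3 * y**4 + math.cos(5 * y)

def second_difference(a, b, h, k):
    """Delta = [f(a+h, b+k) - f(a+h, b)] - [f(a, b+k) - f(a, b)]."""
    return (f(a + h, b + k) - f(a + h, b)) - (f(a, b + k) - f(a, b))

a, b, h = 1.0, 2.0, 1e-4
ratio = second_difference(a, b, h, h) / (h * h)  # approximates the mixed partial at (a, b)
f_xy = 12 * a**2 * b**3                          # 3*4 x^2 y^3, computed by hand
```

Because $\Delta$ is symmetric in the roles of $x$ and $y$, the same quotient approximates both $f_{xy}$ and $f_{yx}$, which is the idea behind the proof.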
EXERCISES
Compute $\dfrac{\partial^2 f}{\partial x\,\partial y}$ and $\dfrac{\partial^2 f}{\partial y\,\partial x}$; verify that they are equal:
1. x*yb 2. xy6
3. x/ y2 4. x + z32/ + yA
5. sin (x + y ) 6. cos (xy)
7. ex,y 8. arc tan (a; + 2?/)
9. xw2/n 10. g(x) + /&(?/)
11. xy 12. y*
13. 14. 2X2/
x y x2 + y2
x y
1 + xy
15. ~~ 16. (x y) ( x 2y) (x Sy)
1 x2+
17. -7-------T7----- s - t 18. ^
(x i/)(x 2y) *t/2+ z
19. ax2 + 2bxy + cy2 20. sinh(x + y2).
21. Show that each function of the form $f(x, y) = g(x + y) + h(x - y)$ satisfies the
partial differential equation $\dfrac{\partial^2 f}{\partial x^2} - \dfrac{\partial^2 f}{\partial y^2} = 0$.
22. Show that each function of the form $f(x, y) = g(x) + h(y)$ satisfies the partial
differential equation $\dfrac{\partial^2 f}{\partial x\,\partial y} = 0$.
23. Prove the converse of Ex. 22: each solution of $f_{xy} = 0$ has the form $f(x, y) =
g(x) + h(y)$.
2. HIGHER PARTIALS
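A function $z = f(x, y)$ has two distinct first partials:
$$\frac{\partial z}{\partial x}, \qquad \frac{\partial z}{\partial y},$$
and three distinct second partials:
$$\frac{\partial^2 z}{\partial x^2}, \qquad \frac{\partial^2 z}{\partial x\,\partial y}, \qquad \frac{\partial^2 z}{\partial y^2}.$$
Because the mixed second partials are equal, so are certain mixed third
partials. For example,
$$\frac{\partial^2}{\partial x^2}\left(\frac{\partial z}{\partial y}\right) = \frac{\partial}{\partial y}\left(\frac{\partial^2 z}{\partial x^2}\right)$$
since
$$\frac{\partial^2}{\partial x^2}\left(\frac{\partial z}{\partial y}\right) = \frac{\partial}{\partial x}\left[\frac{\partial}{\partial x}\left(\frac{\partial z}{\partial y}\right)\right] = \frac{\partial}{\partial x}\left[\frac{\partial}{\partial y}\left(\frac{\partial z}{\partial x}\right)\right] = \frac{\partial}{\partial y}\left[\frac{\partial}{\partial x}\left(\frac{\partial z}{\partial x}\right)\right] = \frac{\partial}{\partial y}\left(\frac{\partial^2 z}{\partial x^2}\right).$$
Thus there are precisely four distinct third partials:
$$\frac{\partial^3 z}{\partial x^3}, \qquad \frac{\partial^3 z}{\partial x^2\,\partial y}, \qquad \frac{\partial^3 z}{\partial x\,\partial y^2}, \qquad \frac{\partial^3 z}{\partial y^3}.$$
In general there are $n + 1$ distinct partials of order $n$:
$$\frac{\partial^n z}{\partial x^k\,\partial y^{n-k}}, \qquad k = 0, 1, 2, \ldots, n.$$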
EXAMPLE 2.1
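Find all functions $z = f(x, y)$ which satisfy the system of partial
differential equations $\dfrac{\partial^2 z}{\partial x^2} = 0$, $\dfrac{\partial^2 z}{\partial x\,\partial y} = 0$, $\dfrac{\partial^2 z}{\partial y^2} = 0$.
Solution: If both first partials of a function are 0 everywhere, then the
function is constant. (Why?) This applies to the function $\partial z/\partial x$ since
$\dfrac{\partial}{\partial x}\left(\dfrac{\partial z}{\partial x}\right) = \dfrac{\partial^2 z}{\partial x^2} = 0$ and $\dfrac{\partial}{\partial y}\left(\dfrac{\partial z}{\partial x}\right) = \dfrac{\partial^2 z}{\partial y\,\partial x} = \dfrac{\partial^2 z}{\partial x\,\partial y} = 0.$
Hence
$\dfrac{\partial z}{\partial x} = A.$
Integrate with respect to $x$, holding $y$ constant:
$z = Ax + g(y),$
where the constant of integration is an arbitrary function of $y$. Since
$\partial^2 z/\partial y^2 = 0$, we have $g''(y) = 0$.
Consequently $g(y)$ is a linear function,
$g(y) = By + C.$
Answer: $z = Ax + By + C$.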
EXAMPLE 2.2
Find all functions $z = f(x, y)$ whose third partials are all 0.
Solution: The second partials of $\partial z/\partial x$ are all 0. By the last example,
$\dfrac{\partial z}{\partial x} = Ax + By + C.$
Integrate:
$z = \tfrac{1}{2}Ax^2 + Bxy + Cx + g(y).$
But
$0 = \dfrac{\partial^3 z}{\partial y^3} = \dfrac{d^3 g}{dy^3}.$
Consequently $g(y)$ is a quadratic polynomial in $y$.
Answer: Any quadratic polynomial in $x$ and $y$.
EXAMPLE 2.3
Find all functions $z = f(x, y)$ that satisfy $\dfrac{\partial^2 z}{\partial x^2} = 0$.
Solution: Write the equation
$\dfrac{\partial}{\partial x}\left(\dfrac{\partial z}{\partial x}\right) = 0,$
then integrate:
$\dfrac{\partial z}{\partial x} = g(y).$
Integrate again:
$z = g(y)x + h(y).$
Answer: $z(x, y) = g(y)x + h(y)$,
where $g(y)$ and $h(y)$ are arbitrary
functions of $y$.
EXAMPLE 2.4
Find all functions $z = f(x, y)$ that satisfy
$\dfrac{\partial^2 z}{\partial x\,\partial y} = 0.$
Solution: Write the equation
$\dfrac{\partial}{\partial y}\left(\dfrac{\partial z}{\partial x}\right) = 0,$
then integrate:
$\dfrac{\partial z}{\partial x} = p(x).$
Integrate again:
$z = g(x) + h(y),$
where $g(x)$ is an antiderivative of $p(x)$. Note that $g(x)$ is an arbitrary
function of $x$ since $p(x)$ is.
Answer: $z(x, y) = g(x) + h(y)$,
where $g(x)$ and $h(y)$ are arbitrary
differentiable functions of one
variable.
Check:
$\dfrac{\partial^2}{\partial x\,\partial y}[g(x) + h(y)] = \dfrac{\partial}{\partial x}\left\{\dfrac{\partial}{\partial y}[g(x) + h(y)]\right\} = \dfrac{\partial}{\partial x}[h'(y)] = 0.$
EXAMPLE 2.5
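Find all functions $z = f(x, y)$ that satisfy the system of partial
differential equations $\dfrac{\partial z}{\partial x} = y$, $\dfrac{\partial z}{\partial y} = 1$.
Solution: Integrate the first equation:
$z = xy + g(y).$
Substitute this into the second equation:
$\dfrac{\partial}{\partial y}[xy + g(y)] = 1, \qquad x + g'(y) = 1, \qquad g'(y) = 1 - x.$
This is impossible since the left-hand side is a function of $y$ alone.
Answer: No solution.
This example illustrates an important point. A system of partial differential
equations may have no solution at all! Could we have foreseen this catastrophe
for the system above? Yes; for suppose there were a function $f(x, y)$ satisfying
$\dfrac{\partial f}{\partial x} = y, \qquad \dfrac{\partial f}{\partial y} = 1.$
Then
$\dfrac{\partial^2 f}{\partial y\,\partial x} = \dfrac{\partial}{\partial y}(y) = 1, \qquad \dfrac{\partial^2 f}{\partial x\,\partial y} = \dfrac{\partial}{\partial x}(1) = 0,$
so the mixed partials would be unequal, a contradiction.
If the system of equations $\dfrac{\partial z}{\partial x} = p(x, y)$, $\dfrac{\partial z}{\partial y} = q(x, y)$
has a solution, then $\dfrac{\partial p}{\partial y} = \dfrac{\partial q}{\partial x}$.
Indeed,
$\dfrac{\partial p}{\partial y} = \dfrac{\partial}{\partial y}\left(\dfrac{\partial z}{\partial x}\right) = \dfrac{\partial}{\partial x}\left(\dfrac{\partial z}{\partial y}\right) = \dfrac{\partial q}{\partial x}.$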
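The compatibility condition can be spot-checked numerically before attempting to integrate a system. The sketch below is an illustration only: the helper names `partial` and `is_compatible`, the sample points, and the tolerance are assumptions, and the partials are estimated by central differences rather than computed exactly. It tests only the necessary condition stated above.

```python
def partial(fn, x, y, var, step=1e-6):
    """Central-difference estimate of a first partial of fn at (x, y)."""
    if var == "x":
        return (fn(x + step, y) - fn(x - step, y)) / (2 * step)
    return (fn(x, y + step) - fn(x, y - step)) / (2 * step)

def is_compatible(p, q, points, tol=1e-4):
    """Check dp/dy == dq/dx (to within tol) at each sample point."""
    return all(abs(partial(p, x, y, "y") - partial(q, x, y, "x")) <= tol
               for x, y in points)

samples = [(0.0, 0.0), (1.0, 2.0), (-0.5, 0.3)]

# Example 2.5: p = y, q = 1, so dp/dy = 1 while dq/dx = 0.
example_2_5 = is_compatible(lambda x, y: y, lambda x, y: 1.0, samples)
# A compatible pair: p = y, q = x, solved by z = xy + C.
compatible_pair = is_compatible(lambda x, y: y, lambda x, y: x, samples)
print(example_2_5, compatible_pair)
```

A failed check proves that no solution exists; a passed check at a few sample points is only evidence, not a proof.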
More Variabl es
All that has been said applies to functions of three or more variables.
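For example, suppose $w = f(x, y, z)$. Then $w$ has three first partials:
$\dfrac{\partial w}{\partial x}, \quad \dfrac{\partial w}{\partial y}, \quad \dfrac{\partial w}{\partial z}.$
The nine possible second partials may be written in matrix form:
$\begin{bmatrix} \dfrac{\partial^2 w}{\partial x^2} & \dfrac{\partial^2 w}{\partial x\,\partial y} & \dfrac{\partial^2 w}{\partial x\,\partial z} \\ \dfrac{\partial^2 w}{\partial y\,\partial x} & \dfrac{\partial^2 w}{\partial y^2} & \dfrac{\partial^2 w}{\partial y\,\partial z} \\ \dfrac{\partial^2 w}{\partial z\,\partial x} & \dfrac{\partial^2 w}{\partial z\,\partial y} & \dfrac{\partial^2 w}{\partial z^2} \end{bmatrix}$
This matrix is symmetric since the mixed second partials are equal in pairs:
$\dfrac{\partial^2 w}{\partial y\,\partial x} = \dfrac{\partial^2 w}{\partial x\,\partial y}, \qquad \dfrac{\partial^2 w}{\partial x\,\partial z} = \dfrac{\partial^2 w}{\partial z\,\partial x}, \qquad \dfrac{\partial^2 w}{\partial z\,\partial y} = \dfrac{\partial^2 w}{\partial y\,\partial z}.$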
EXERCISES
Compute $\dfrac{\partial^3 f}{\partial x^2\,\partial y}$ and $\dfrac{\partial^3 f}{\partial x\,\partial y^2}$:
1. x*y*
3. x2yA
5. cos (xy)
7. e^sinz
9. xlly
11.
2. xAyb
4. xmyn
6. sin (x2y)
8. xy
x2+ ;
10.
12.
13. Find all functions f(x, y) such that
ay
dx2dy
x y
x + y
xy
x2+ y2
= 0.
14. Find all functions $f(x, y)$ such that $\dfrac{\partial^3 f}{\partial x^2\,\partial y} = 0$ and $\dfrac{\partial^3 f}{\partial x\,\partial y^2} = 0$.
15. Find all functions $f(x, y)$ whose 4-th partial derivatives all equal 0.
16. Find all functions $f(x, y)$ such that $\dfrac{\partial^4 f}{\partial x^2\,\partial y^2} = 0$.
Find all functions f(x, y) that satisfy the system of partial differential equations:
17 dl = n dl = h
dx dy
1Q df 2 df
19. = y2, = X2
dx dy 20- s - v ' I - 3*9-
22. xy + yz + zx
24. x2+ yz.
Write the matrix of 9 second partials:
21. xmynzp
23. sin (a; + 2y + Sz)
25. How many distinct third partials does $f(x, y, z)$ have?
26. (cont.) Find an explicit function for which they really are distinct.
27. Find all functions $f(x, y, z)$ satisfying $\dfrac{\partial^3 f}{\partial x\,\partial y\,\partial z} = 0$.
28. How many distinct second partials does a function $f(x, y, z, w)$ of 4 variables
have? How many distinct third partials?
3. TAYLOR POLYNOMIALS
Let us recall some facts about Taylor polynomials. (See Chapter 2,
Section 3.) If $y = f(x)$, then
$f(x) = f(a) + f'(a)(x - a) + r_1(x),$
and
$f(x) = f(a) + f'(a)(x - a) + \tfrac{1}{2}f''(a)(x - a)^2 + r_2(x),$
where
$|r_1(x)| \le \dfrac{M_2}{2!}(x - a)^2, \qquad |r_2(x)| \le \dfrac{M_3}{3!}|x - a|^3,$
and where $M_2$ and $M_3$ are bounds for $|f''(x)|$ and $|f'''(x)|$ respectively.
The Taylor polynomial
$p_1(x) = f(a) + f'(a)(x - a)$
is constructed so that $p_1(a) = f(a)$ and $p_1'(a) = f'(a)$. The Taylor polynomial
$p_2(x) = f(a) + f'(a)(x - a) + \tfrac{1}{2}f''(a)(x - a)^2$
is constructed so that $p_2(a) = f(a)$, $p_2'(a) = f'(a)$, $p_2''(a) = f''(a)$.
In a similar way, one can construct linear and quadratic polynomials in
two variables approximating a given function of two variables.
Taylor polynomials. Let $f(x, y)$ have continuous first and second partials
on an open domain $D$. The first degree and second degree Taylor
polynomials of $f$ at $(a, b)$ are
$p_1(x, y) = f(a, b) + f_x \cdot (x - a) + f_y \cdot (y - b),$
$p_2(x, y) = p_1(x, y) + \tfrac{1}{2}[f_{xx} \cdot (x - a)^2 + 2f_{xy} \cdot (x - a)(y - b) + f_{yy} \cdot (y - b)^2],$
where all the partials are evaluated at $(a, b)$.
It is easy to check that $p_1(a, b) = f(a, b)$ and that the first partials of $p_1$
agree with those of $f$ at $(a, b)$. Similarly, $p_2(a, b) = f(a, b)$ and all first and
second partials of $p_2$ agree with the corresponding partials of $f$ at $(a, b)$.
Now we ask how closely these Taylor polynomials approximate $f(x, y)$
for $(x, y)$ near $(a, b)$. In other words, we want estimates for the errors in the
approximations $f(x, y) \approx p_1(x, y)$ and $f(x, y) \approx p_2(x, y)$. We shall obtain
error estimates subject to the mild condition that $f$ is defined on a convex
domain. A set $S$ in the plane or space is called convex if it contains the whole
segment joining any two of its points.
Theorem Let $f$ be defined on a convex open domain $D$ in $\mathbf{R}^2$ and let $\mathbf{a} \in D$.
(1) Suppose $f$ has continuous first and second derivatives on $D$ and the
second derivatives satisfy
$|f_{xx}| \le M_2, \quad |f_{xy}| \le M_2, \quad |f_{yy}| \le M_2,$ for all $\mathbf{x} \in D$.
Let $p_1$ be the first degree Taylor polynomial of $f$ at $\mathbf{a}$. Then
$f(\mathbf{x}) = p_1(\mathbf{x}) + r_1(\mathbf{x}),$
where $|r_1(\mathbf{x})| \le M_2\,|\mathbf{x} - \mathbf{a}|^2$.
(2) Suppose $f$ has continuous first, second, and third derivatives on $D$
and the third derivatives satisfy
$|f_{xxx}| \le M_3, \quad |f_{xxy}| \le M_3, \quad |f_{xyy}| \le M_3, \quad |f_{yyy}| \le M_3.$
Let $p_2$ be the second degree Taylor polynomial of $f$ at $\mathbf{a}$. Then
$f(\mathbf{x}) = p_2(\mathbf{x}) + r_2(\mathbf{x}),$
where $|r_2(\mathbf{x})| \le \dfrac{\sqrt{2}}{3} M_3\,|\mathbf{x} - \mathbf{a}|^3$.
Proof: The idea is to interpret the problem in such a way that we can
use the error estimates on the previous page for a function of one variable.
Assume at first that $\mathbf{a} = \mathbf{0}$; this will simplify the notation considerably.
Now fix a point $\mathbf{x} \in D$. By convexity, $D$ contains the entire line segment
connecting $\mathbf{0}$ and $\mathbf{x}$, that is, $D$ contains all points $t\mathbf{x}$ for $0 \le t \le 1$. (Actually
because it is open, $D$ contains a slightly larger open segment.)
Set $g(t) = f(t\mathbf{x})$. Then $g(t)$ is a function of one variable defined for
$0 \le t \le 1$ and
$g(0) = f(\mathbf{0}), \qquad g(1) = f(\mathbf{x}).$
Let us compute the first and second degree Taylor polynomials of $g$ at $t = 0$.
By the Chain Rule,
$g'(t) = f_x(t\mathbf{x})x + f_y(t\mathbf{x})y,$
$g'(0) = f_x(\mathbf{0})x + f_y(\mathbf{0})y,$
$g''(t) = f_{xx}(t\mathbf{x})x^2 + 2f_{xy}(t\mathbf{x})xy + f_{yy}(t\mathbf{x})y^2,$
$g''(0) = f_{xx}(\mathbf{0})x^2 + 2f_{xy}(\mathbf{0})xy + f_{yy}(\mathbf{0})y^2,$
$g'''(t) = f_{xxx}(t\mathbf{x})x^3 + 3f_{xxy}(t\mathbf{x})x^2y + 3f_{xyy}(t\mathbf{x})xy^2 + f_{yyy}(t\mathbf{x})y^3.$
From these calculations we see that
$p_1(\mathbf{x}) = g(0) + g'(0), \qquad p_2(\mathbf{x}) = g(0) + g'(0) + \tfrac{1}{2}g''(0).$
Thus $p_1(\mathbf{x})$ and $p_2(\mathbf{x})$ are the first and second degree Taylor polynomials of
$g(t)$ evaluated at $t = 1$. Therefore
$|r_1(\mathbf{x})| = |f(\mathbf{x}) - p_1(\mathbf{x})| \le \dfrac{\max |g''(t)|}{2!},$
$|r_2(\mathbf{x})| = |f(\mathbf{x}) - p_2(\mathbf{x})| \le \dfrac{\max |g'''(t)|}{3!}.$
It remains to estimate $|g''(t)|$ and $|g'''(t)|$. We have
$|g''(t)| \le M_2|x|^2 + 2M_2|x|\,|y| + M_2|y|^2 = M_2(|x| + |y|)^2,$
$|g'''(t)| \le M_3|x|^3 + 3M_3|x|^2|y| + 3M_3|x|\,|y|^2 + M_3|y|^3 = M_3(|x| + |y|)^3.$
Now we modify these estimates slightly as follows. From $(|x| - |y|)^2 \ge 0$
we have $2|x|\,|y| \le |x|^2 + |y|^2$, hence
$(|x| + |y|)^2 = |x|^2 + 2|x|\,|y| + |y|^2 \le 2(|x|^2 + |y|^2) = 2|\mathbf{x}|^2.$
Take the $\tfrac{3}{2}$ power:
$(|x| + |y|)^3 \le 2^{3/2}|\mathbf{x}|^3.$
Therefore,
$|g''(t)| \le 2M_2|\mathbf{x}|^2, \qquad |g'''(t)| \le 2\sqrt{2}\,M_3|\mathbf{x}|^3.$
Combining results we obtain
$|r_1(\mathbf{x})| \le \dfrac{2M_2|\mathbf{x}|^2}{2!}, \qquad |r_2(\mathbf{x})| \le \dfrac{2\sqrt{2}\,M_3|\mathbf{x}|^3}{3!}.$
This completes the proof assuming $\mathbf{a} = \mathbf{0}$. In the general case, we define
$g(t) = f[\mathbf{a} + t(\mathbf{x} - \mathbf{a})]$. The proof proceeds as before, except that $(x, y)$
is replaced by $(x - a, y - b)$ and the partials of $f$ are all evaluated at $\mathbf{a}$.
Remark: There are Taylor polynomials of higher degree and corresponding
error estimates. The notation for these polynomials is complicated,
and since we shall not need them, we leave their study to an advanced calculus
course.
EXAMPLE 3.1
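Compute the Taylor polynomials $p_1(x, y)$ and $p_2(x, y)$ of the function
$f(x, y) = \sqrt{x^2 + y^2}$ at $(3, 4)$.
Solution:
$\dfrac{\partial f}{\partial x} = \dfrac{x}{\sqrt{x^2 + y^2}}, \qquad \dfrac{\partial f}{\partial y} = \dfrac{y}{\sqrt{x^2 + y^2}},$
$\dfrac{\partial^2 f}{\partial x^2} = \dfrac{y^2}{(x^2 + y^2)^{3/2}}, \qquad \dfrac{\partial^2 f}{\partial x\,\partial y} = \dfrac{-xy}{(x^2 + y^2)^{3/2}}, \qquad \dfrac{\partial^2 f}{\partial y^2} = \dfrac{x^2}{(x^2 + y^2)^{3/2}}.$
At $(3, 4)$,
$\dfrac{\partial f}{\partial x} = \dfrac{3}{5}, \quad \dfrac{\partial f}{\partial y} = \dfrac{4}{5}, \quad \dfrac{\partial^2 f}{\partial x^2} = \dfrac{16}{125}, \quad \dfrac{\partial^2 f}{\partial x\,\partial y} = -\dfrac{12}{125}, \quad \dfrac{\partial^2 f}{\partial y^2} = \dfrac{9}{125}.$
Therefore
$p_2(x, y) = 5 + \dfrac{3}{5}(x - 3) + \dfrac{4}{5}(y - 4) + \dfrac{1}{2}\left[\dfrac{16}{125}(x - 3)^2 - \dfrac{24}{125}(x - 3)(y - 4) + \dfrac{9}{125}(y - 4)^2\right].$
The polynomial $p_1(x, y)$ is just the linear (first degree) part of $p_2(x, y)$.
Answer:
$p_1(x, y) = \dfrac{1}{5}[25 + 3(x - 3) + 4(y - 4)];$
$p_2(x, y) = p_1(x, y) + \dfrac{1}{250}[16(x - 3)^2 - 24(x - 3)(y - 4) + 9(y - 4)^2].$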
EXAMPLE 3.2
Estimate $\sqrt{(3.1)^2 + (4.02)^2}$ by
(a) $p_1(x, y)$, (b) $p_2(x, y)$,
the Taylor polynomials in the last example.
Solution: Let $f(x, y) = \sqrt{x^2 + y^2}$.
(a) Near $(3, 4)$,
$f(x, y) \approx \dfrac{1}{5}[25 + 3(x - 3) + 4(y - 4)],$
$f(3.1, 4.02) \approx \dfrac{1}{5}[25 + 3(0.1) + 4(0.02)] = 5.076.$
(b)
$f(x, y) \approx p_1(x, y) + \dfrac{1}{250}[16(x - 3)^2 - 24(x - 3)(y - 4) + 9(y - 4)^2],$
$f(3.1, 4.02) \approx p_1(3.1, 4.02) + \dfrac{1}{250}[16(0.1)^2 - 24(0.1)(0.02) + 9(0.02)^2]$
$= 5.076 + \dfrac{0.1156}{250} = 5.0764624.$
Answer: (a) Approximately 5.076;
(b) Approximately 5.0764624.
(Actual value to 7 places: 5.0764555.)
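The arithmetic in this example is easy to reproduce. The following minimal sketch hard-codes the two Taylor polynomials of $\sqrt{x^2 + y^2}$ at $(3, 4)$ from the previous example; the function names `p1` and `p2` are illustrative, and `math.hypot` supplies the reference value.

```python
import math

def p1(x, y):
    """First degree Taylor polynomial of sqrt(x^2 + y^2) at (3, 4)."""
    return (25 + 3 * (x - 3) + 4 * (y - 4)) / 5

def p2(x, y):
    """Second degree Taylor polynomial of sqrt(x^2 + y^2) at (3, 4)."""
    dx, dy = x - 3, y - 4
    return p1(x, y) + (16 * dx**2 - 24 * dx * dy + 9 * dy**2) / 250

x, y = 3.1, 4.02
exact = math.hypot(x, y)  # sqrt(x^2 + y^2)
print(p1(x, y), p2(x, y), exact)
print("errors:", abs(p1(x, y) - exact), abs(p2(x, y) - exact))
```

The second degree estimate should be closer to the reference value than the first degree estimate.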
EXAMPLE 3.3
If $|x| < 0.1$ and $|y| < 0.1$, prove that
$|e^x \sin(x + y) - (x + y)| < 0.05.$
Solution: Let $f(x, y) = e^x \sin(x + y)$. Then as is easily checked,
$x + y$ is simply $p_1(x, y)$, the first degree Taylor polynomial of $f(x, y)$ at $(0, 0)$.
Thus we are asked to verify that $|r_1(\mathbf{x})| < 0.05$ for points $\mathbf{x} = (x, y)$ with
$|x| < 0.1$ and $|y| < 0.1$. Such points satisfy $|\mathbf{x}|^2 < (0.1)^2 + (0.1)^2 = 0.02$, so we may
restrict the domain of $f$ to the open disk $|\mathbf{x}|^2 < 0.02$.
Our error estimate yields
$|r_1(\mathbf{x})| \le M_2\,|\mathbf{x}|^2 < (0.02)M_2,$
where $M_2$ is a bound for $|f_{xx}|$, $|f_{xy}|$, $|f_{yy}|$. To find a suitable value for $M_2$,
compute the second partials:
$f_{xx} = 2e^x \cos(x + y), \qquad f_{xy} = e^x[\cos(x + y) - \sin(x + y)],$
$f_{yy} = -e^x \sin(x + y).$
The elementary estimates $|\sin(x + y)| \le 1$ and $|\cos(x + y)| \le 1$ show that
$|f_{xx}| \le 2e^x, \qquad |f_{xy}| \le 2e^x, \qquad |f_{yy}| \le e^x.$
Now $|x| < 0.1$, so
$e^x < e^{0.1} < 1.11,$
as is seen from a table. Therefore we take $M_2 = 2(1.11) = 2.22$, and it
follows that
$|r_1(\mathbf{x})| < (0.02)(2.22) = 0.0444 < 0.05.$
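A coarse numerical scan gives an independent sanity check on this bound. This is only a sketch: the grid resolution `n` is an arbitrary choice, and sampling finitely many points cannot replace the proof above. It merely confirms that the observed error on the square stays under the theoretical bound.

```python
import math

def error(x, y):
    """|f(x, y) - p1(x, y)| for f = e^x sin(x + y), p1 = x + y."""
    return abs(math.exp(x) * math.sin(x + y) - (x + y))

n = 40  # grid resolution (arbitrary choice for illustration)
grid = [-0.1 + 0.2 * i / n for i in range(n + 1)]
worst = max(error(x, y) for x in grid for y in grid)
bound = 0.02 * 2.22  # the estimate (0.02) * M_2 from the solution
print(worst, bound)
```

The observed worst error is noticeably smaller than the bound, which is expected: the estimate uses crude bounds on sine, cosine, and $|\mathbf{x}|^2$.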
EXERCISES
Compute the Taylor polynomials p i (x, y ) and p2( x, y) :
1. x2y2; at (1, 1) 2. x4ys; at (2,-1)
3. sin (xy); at (0,0) 4. exy; at (0,0)
5. xy\ at (1,0) 6. xy; at (1,1)
7. cos{x + y); at (0, t / 2) 8. 1+ xy; at (1, 1)
9. In Or+ 2y); at (^,i ) 10. x2ey; at (1,0).
Estimate using the second degree Taylor polynomial. Carry your work to 5 significant
figures:
11. $(1.1)^{1.2}$ 12. $[(1.2)^2 + 7.2]^{1/3}$
13. /(1.01, 2.01), where f(x, y) = x*y2 2xy4+ yb
14. $\sqrt{(1.99)^2 + (3.01)^2 + (6.01)^2}$.
15. Let $p_1(x, y)$ be the first degree Taylor polynomial of $f(x, y)$ at $(a, b)$. Identify
the graph of $z = p_1(x, y)$.
16. Let $p_1(x, y, z)$ be the first degree Taylor polynomial of $f(x, y, z)$ at $(a, b, c)$.
Assume grad $f \ne 0$ at $(a, b, c)$. Identify the graph of the equation $p_1(x, y, z) =
f(a, b, c)$.
Prove the inequality, given $|x| < 0.1$ and $|y| < 0.1$:
17. $|\sqrt{1 + x + 2y} - (1 + \tfrac{1}{2}x + y)| < 0.04$
18. $|e^x \sin(x + y) - (1 + x)(x + y)| < 0.01$
(even $< 0.005$ with more careful estimates).
4. MAXIMA AND MINIMA
In this section we shall develop second derivative tests for maxima and
minima.
We shall assume here and in the rest of this chapter that all functions have
continuous first, second, and third partial derivatives. With this assumption,
the Taylor approximations of the previous section apply.
Let us begin with a brief review of the one-variable case. We consider a
function $g(t)$ and a point $c$ where $g'(c) = 0$. Suppose $g''(c) > 0$. We want to
conclude that $g(c)$ is a relative minimum of $g$. For this purpose, an excellent
tool is the second degree Taylor approximation of $g$ at $c$ (the first degree term
drops out because $g'(c) = 0$):
$g(t) = g(c) + \tfrac{1}{2}g''(c)(t - c)^2 + r_2(t),$
$|r_2(t)| \le k\,|t - c|^3, \qquad k > 0.$
It follows that
$g(t) - g(c) = \tfrac{1}{2}g''(c)(t - c)^2 + r_2(t) \ge \tfrac{1}{2}g''(c)(t - c)^2 - k\,|t - c|^3$
$= (t - c)^2\left[\tfrac{1}{2}g''(c) - k\,|t - c|\right].$
Since $g''(c) > 0$, the quantity on the right is positive if $0 < |t - c| < \tfrac{1}{2}g''(c)/k$.
Thus there is a positive number $\delta = \tfrac{1}{2}g''(c)/k$ such that $g(t) - g(c) > 0$
when $0 < |t - c| < \delta$. In other words, $g(c)$ is smaller than any other value
of $g$ in an interval of radius $\delta$ and center $c$. Hence $g(c)$ is a relative minimum
of $g$.
Hessian Mat r i x
Let us try to generalize these ideas to the two variable case. Given $f(x, y)$,
suppose $(a, b)$ is an interior point of the domain of $f$ where $f_x(a, b) =
f_y(a, b) = 0$. We would like a condition on the second derivatives of $f$, analogous
to $g''(c) > 0$, guaranteeing that $f(a, b)$ is a relative minimum of $f$. Now the
condition $g''(c) > 0$ can be interpreted as meaning that
$g''(c)(t - c)^2$
is a positive definite quadratic form in the one variable $t - c$. It is natural,
therefore, to examine the quadratic part of the second degree Taylor polynomial
of $f(x, y)$ at $(a, b)$. If it is positive definite, we suspect that $f(a, b)$ is a
minimum of $f$.
The quadratic part in question is
$\tfrac{1}{2}[f_{xx} \cdot (x - a)^2 + 2f_{xy} \cdot (x - a)(y - b) + f_{yy} \cdot (y - b)^2].$
In matrix-vector notation, this can be expressed as
$\tfrac{1}{2}(\mathbf{x} - \mathbf{a})\,H_f\,(\mathbf{x} - \mathbf{a})',$
where
$H_f = \begin{bmatrix} f_{xx} & f_{xy} \\ f_{yx} & f_{yy} \end{bmatrix}.$
The matrix $H_f$ is called the Hessian matrix of $f$. Our assumption on the partial
derivatives of $f$ ensures that $f_{xy} = f_{yx}$. Hence $H_f$ is a symmetric matrix.
Note: It is more accurate to write $H_f(x, y)$ since the Hessian matrix
generally depends on the point $(x, y)$. When the meaning is clear from the
context, however, we shall use the simpler notation $H_f$.
Second Deri vat i ve Test
Theorem 4.1 Let $\mathbf{a} = (a, b)$ be an interior point of the domain of $f$. Suppose
that
$f_x(a, b) = 0, \qquad f_y(a, b) = 0,$
and that the Hessian matrix $H_f(a, b)$ is positive definite.
Then $f(a, b)$ is a relative minimum of $f$. That is, there is $\delta > 0$ such that
$f(x, y) > f(a, b)$
whenever
$0 < |\mathbf{x} - \mathbf{a}| < \delta.$
Proof: To make the notation simple, let us take $(a, b) = (0, 0)$.
Then the second degree Taylor approximation of $f$ at $(0, 0)$ is
$f(x, y) = f(0, 0) + \tfrac{1}{2}[Ax^2 + 2Bxy + Cy^2] + r(x, y),$
where
$H_f(0, 0) = \begin{bmatrix} A & B \\ B & C \end{bmatrix} = \begin{bmatrix} f_{xx} & f_{xy} \\ f_{yx} & f_{yy} \end{bmatrix}_{(0,0)},$
$|r(x, y)| \le h\,|\mathbf{x}|^3$,
and $h$ is a positive constant. Since $H_f$ is positive definite, we can apply Theorem
5.4, p. 249; there is another positive constant $k$ such that
$Ax^2 + 2Bxy + Cy^2 \ge k\,|\mathbf{x}|^2.$
Consequently
$f(x, y) - f(0, 0) \ge \tfrac{1}{2}k\,|\mathbf{x}|^2 - h\,|\mathbf{x}|^3 = |\mathbf{x}|^2\left(\tfrac{1}{2}k - h\,|\mathbf{x}|\right).$
This implies $f(x, y) - f(0, 0) > 0$, that is $f(x, y) > f(0, 0)$, provided
$0 < |\mathbf{x}| < \tfrac{1}{2}k/h$. Finally, for $\delta$ we may choose any number such that first
$0 < \delta < \tfrac{1}{2}k/h$ and second $\delta$ is so small that the disk $|\mathbf{x}| < \delta$ is contained in the
domain $D$.
Remark: The essential point in the proof is that for $|\mathbf{x}|$ small, the positive
definite quadratic form $Ax^2 + 2Bxy + Cy^2$ is much larger than the remainder
$r(x, y)$ because $r$ is of third order: $|r(x, y)| \le h\,|\mathbf{x}|^3$.
By Theorem 5.1, p. 247, the matrix $H_f(a, b)$ is positive definite if and
only if
$f_{xx}(a, b) > 0, \qquad \begin{vmatrix} f_{xx} & f_{xy} \\ f_{yx} & f_{yy} \end{vmatrix}_{(a,b)} > 0,$
that is,
$f_{xx}(a, b) > 0, \qquad (f_{xx}f_{yy} - f_{xy}^2)\big|_{(a,b)} > 0.$
Suppose we want a maximum instead of a minimum. We reason that
finding a maximum of $f$ is the same as finding a minimum of $-f$. Therefore
the following conditions are sufficient for $f(a, b)$ to be a relative maximum of $f$
at a point of its (open) domain:
$f_x(a, b) = 0, \qquad f_y(a, b) = 0,$
$f_{xx}(a, b) < 0, \qquad (f_{xx}f_{yy} - f_{xy}^2)\big|_{(a,b)} > 0.$
The two inequalities make up a necessary and sufficient condition for the
Hessian matrix of $-f$,
$H_{-f} = \begin{bmatrix} -f_{xx} & -f_{xy} \\ -f_{yx} & -f_{yy} \end{bmatrix},$
to be positive definite, namely, $-f_{xx} > 0$ and $\det H_{-f} > 0$. Note particularly
that $\det H_{-f} = \det H_f$, not the negative.
Let us review our procedures. We start with $f$ on an open domain $D$.
We find a point $(a, b)$ where $f_x = 0$ and $f_y = 0$. We compute the Hessian
matrix
$H_f = \begin{bmatrix} f_{xx} & f_{xy} \\ f_{yx} & f_{yy} \end{bmatrix}$
at $(a, b)$. There are three possibilities for $\det H_f = f_{xx}f_{yy} - f_{xy}^2$:
(1) $\det H_f > 0$, (2) $\det H_f = 0$, (3) $\det H_f < 0$.
In case (1), we have
$f_{xx}f_{yy} > f_{xy}^2 \ge 0,$
so $f_{xx} \ne 0$. Either $f_{xx} > 0$ and $f(a, b)$ is a relative minimum, or $f_{xx} < 0$ and
$f(a, b)$ is a relative maximum.
In case (2), no conclusion can be drawn. For example take the functions
$f(x, y) = x^3y^3$ and $g(x, y) = x^4y^4$. Then $f_x = f_y = 0$ and $H_f = 0$ at $(0, 0)$,
and similarly for $g$. Yet $g$ has a minimum at $(0, 0)$ whereas $f$ has neither a
minimum nor a maximum.
In case (3) there is useful information and we state it as a formal theorem.
Theorem 4.2 Suppose $f_x(a, b) = 0$ and $f_y(a, b) = 0$ at an interior point
of the domain of $f$. Suppose
$\begin{vmatrix} f_{xx} & f_{xy} \\ f_{yx} & f_{yy} \end{vmatrix}_{(a,b)} < 0.$
Then $f(a, b)$ is neither a relative maximum nor a relative minimum of $f$.
That is, there are points $(x, y)$ arbitrarily close to $(a, b)$ where $f(x, y) <
f(a, b)$ and there are points $(x, y)$ arbitrarily close to $(a, b)$ where $f(x, y) >
f(a, b)$.
Proof: We take $(a, b) = (0, 0)$ for simplicity and consider the Taylor
approximation
$f(x, y) = f(0, 0) + \tfrac{1}{2}(Ax^2 + 2Bxy + Cy^2) + r(x, y), \qquad |r(x, y)| \le h\,|\mathbf{x}|^3.$
Since
$\begin{vmatrix} A & B \\ B & C \end{vmatrix} = \begin{vmatrix} f_{xx} & f_{xy} \\ f_{yx} & f_{yy} \end{vmatrix}_{(0,0)} < 0,$
the quadratic form $Q(x, y) = Ax^2 + 2Bxy + Cy^2$ is indefinite by Theorem
5.7, p. 251. Therefore there are points $(x_1, y_1)$ and $(x_2, y_2)$ such that
$Q(x_1, y_1) < 0$ and $Q(x_2, y_2) > 0$.
[Fig. 4.1 graph of $z = y^2 - x^2$; panel (b) level curves]
Consider $(x, y) = (tx_1, ty_1)$, where $t > 0$ and small. Then
$\tfrac{1}{2}Q(tx_1, ty_1) = \tfrac{1}{2}Q(x_1, y_1)t^2 = kt^2, \qquad k < 0,$
$|r(tx_1, ty_1)| \le h(x_1^2 + y_1^2)^{3/2}t^3 = h_1t^3,$
hence
$f(tx_1, ty_1) - f(0, 0) \le kt^2 + h_1t^3 = t^2(k + h_1t).$
Now for $t$ sufficiently small, $t^2(k + h_1t) < 0$, hence
$f(tx_1, ty_1) < f(0, 0)$
for points $(tx_1, ty_1)$ as close to $(0, 0)$ as we please.
Similarly $f(tx_2, ty_2) > f(0, 0)$ for points $(tx_2, ty_2)$ as close to $(0, 0)$ as we
please. Therefore $f(0, 0)$ is neither a max nor a min of $f$.
In the case we have just discussed, $f$ is said to have a saddle point at
$(a, b)$; the surface $z = f(x, y)$ is shaped like a saddle near $(a, b, f(a, b))$. The
tangent plane is horizontal; the surface rises in some directions, falls in others,
so that it crosses its tangent plane. This is like the summit of a mountain pass.
See Fig. 4.1 for an example.
Summary Let $(a, b)$ be an interior point of the domain of $f$ and suppose
$f_x(a, b) = f_y(a, b) = 0$.
(1) If $f_{xx} > 0$ and $f_{xx}f_{yy} - f_{xy}^2 > 0$, then $f(a, b)$ is a relative min.
(2) If $f_{xx} < 0$ and $f_{xx}f_{yy} - f_{xy}^2 > 0$, then $f(a, b)$ is a relative max.
(3) If $f_{xx}f_{yy} - f_{xy}^2 < 0$, then $f(a, b)$ is a saddle point, neither a relative
max nor min.
(4) If $f_{xx}f_{yy} - f_{xy}^2 = 0$, no conclusion can be drawn from this information
alone.
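The four cases of the summary translate directly into a small decision procedure. The sketch below is one possible rendering, not taken from the text; the function name `classify_critical_point` and its string labels are illustrative, and it assumes the caller has already verified that both first partials vanish at the point.

```python
def classify_critical_point(fxx, fxy, fyy):
    """Second derivative test at a point where f_x = f_y = 0.

    Arguments are the second partials evaluated at that point.
    """
    det = fxx * fyy - fxy**2  # determinant of the Hessian matrix
    if det > 0:
        # det > 0 forces fxx != 0, so exactly one branch applies.
        return "relative min" if fxx > 0 else "relative max"
    if det < 0:
        return "saddle point"
    return "no conclusion"

print(classify_critical_point(2, 0, 2))    # z = x^2 + y^2 at the origin
print(classify_critical_point(-2, 0, 2))   # z = y^2 - x^2 at the origin (Fig. 4.1)
print(classify_critical_point(0, 0, 0))    # all second partials vanish
```

With floating-point input, an exact comparison of the determinant with zero is fragile; a tolerance would be a reasonable refinement in numerical work.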
EXERCISES
Does the function $z = f(x, y)$ have a possible maximum or minimum at the origin?
Try the first and second derivative tests. Sketch a few level curves near the origin:
1. $z = x^2$ 2. $z = xy$
3. $z = x^2 - 4y^2$ 4. $z = x^2 + 2xy + y^2$
5. $z = x^2 + 2xy - y^2$ 6. $z = x^4 + y^2$
7. $z = x^4 + y^4$ 8. $z = x^2y^2$
9. $z = x^3 + y^3$ 10. $z = x^2 + y^5$.
11. Suppose $f(x, y)$ is a function of two variables and $x = x(t)$ and $y = y(t)$ are
functions of time. Form the composite function $g(t) = f[x(t), y(t)]$. By the Chain
Rule, $\dot{g}(t) = f_x[x(t), y(t)]\,\dot{x}(t) + f_y[x(t), y(t)]\,\dot{y}(t)$. Show that
$\ddot{g} = f_{xx}\dot{x}^2 + 2f_{xy}\dot{x}\dot{y} + f_{yy}\dot{y}^2 + f_x\ddot{x} + f_y\ddot{y}.$
12. (cont.) Suppose also that $f_x(0, 0) = f_y(0, 0) = 0$, and that only curves $\mathbf{x}(t) =
(x(t), y(t))$ are allowed which pass through $(0, 0)$ with speed 1 at $t = 0$. Suppose
for each such curve $\ddot{g}(0) > 0$. Show that $\dot{g}(0) = 0$ and $f_{xx}(0, 0) > 0$ and
$f_{yy}(0, 0) > 0$.
13. (cont.) Show also that $f_{xx}(0, 0)f_{yy}(0, 0) - f_{xy}(0, 0)^2 > 0$.
14. (cont.) Find a formula for $d^3g/dt^3$.
15. (cont.) Suppose $g(x, y) = Ax^2 + 2Bxy + Cy^2$ and $AC - B^2 = 0$. Conclude that
$g(x, y) = \pm(ax + by)^2$.
5. APPLICATIONS
EXAMPLE 5.1
Show that $f(x, y) = 4x^2 - xy + y^2$ has a minimum at $(0, 0)$.
Solution:
$f_x = 8x - y, \qquad f_y = -x + 2y.$
Both partials are 0 at $(0, 0)$. Furthermore,
$f_{xx} = 8, \qquad f_{xy} = -1, \qquad f_{yy} = 2.$
Hence,
$f_{xx} > 0, \qquad f_{xx}f_{yy} - f_{xy}^2 = 15 > 0.$
These conditions ensure a minimum at the origin.
EXAMPLE 5.2
Of all triangles of fixed perimeter, find the one with maximum area.
Solution: If the sides are $a$, $b$, $c$, then the area $A$ is found by Heron's
formula:
$A^2 = s(s - a)(s - b)(s - c),$
where $s = \tfrac{1}{2}(a + b + c)$ is the semiperimeter. Thus
$2s = a + b + c, \qquad s - c = a + b - s,$
$A^2 = s(s - a)(s - b)(a + b - s).$
Here $s$ is constant and the variables are $a$ and $b$. In order to maximize $A$, it
suffices to maximize
$f(a, b) = (s - a)(s - b)(a + b - s).$
Now
$f_a = (s - b)(2s - 2a - b), \qquad f_b = (s - a)(2s - a - 2b),$
$f_{aa} = -2(s - b), \qquad f_{ab} = -3s + 2a + 2b, \qquad f_{bb} = -2(s - a).$
The equations $f_a = 0$ and $f_b = 0$ imply
$2a + b = 2s, \qquad a + 2b = 2s,$
since $s = a$ or $s = b$ is impossible in a triangle. (Why?) It follows that
$a = b = \tfrac{2}{3}s.$
For these values of $a$ and $b$,
$f_{aa} = f_{bb} = -\tfrac{2}{3}s < 0, \qquad f_{ab} = -\tfrac{1}{3}s,$
$f_{aa}f_{bb} - f_{ab}^2 = \dfrac{4s^2}{9} - \dfrac{s^2}{9} = \dfrac{s^2}{3} > 0.$
These conditions ensure a maximum. Now compute $c$:
$c = 2s - a - b = 2s - \tfrac{2}{3}s - \tfrac{2}{3}s = \tfrac{2}{3}s.$
Hence $a = b = c$.
Answer: The equilateral triangle.
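The second derivative computation for this example can be verified with exact rational arithmetic. In this sketch the semiperimeter value $s = 3/2$ is a hypothetical choice made only for illustration, and the formulas for the partials are those obtained by differentiating $f(a, b) = (s - a)(s - b)(a + b - s)$.

```python
from fractions import Fraction

def first_partials(a, b, s):
    """First partials of f(a, b) = (s - a)(s - b)(a + b - s)."""
    return (s - b) * (2 * s - 2 * a - b), (s - a) * (2 * s - a - 2 * b)

def second_partials(a, b, s):
    """Second partials f_aa, f_ab, f_bb of the same function."""
    return -2 * (s - b), -3 * s + 2 * a + 2 * b, -2 * (s - a)

s = Fraction(3, 2)            # hypothetical semiperimeter (perimeter 3)
a = b = Fraction(2, 3) * s    # the critical point a = b = 2s/3
f_a, f_b = first_partials(a, b, s)
f_aa, f_ab, f_bb = second_partials(a, b, s)
det = f_aa * f_bb - f_ab**2
print(f_a, f_b, f_aa, f_ab, f_bb, det)
```

Both first partials vanish, $f_{aa}$ is negative, and the determinant equals $s^2/3$, matching the conditions for a relative maximum.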
EXAMPLE 5.3
Find the point on the paraboloid $z = \dfrac{x^2}{4} + \dfrac{y^2}{9}$
closest to $\mathbf{i} = (1, 0, 0)$.
Solution: Let $\mathbf{x} = (x, y, z)$ be a point on the surface. Then
$|\mathbf{x} - \mathbf{i}|^2 = (x - 1)^2 + y^2 + z^2$
$= (x - 1)^2 + y^2 + \left(\dfrac{x^2}{4} + \dfrac{y^2}{9}\right)^2 = f(x, y).$
The function $f(x, y)$ must be minimized. Now
$f_x = 2(x - 1) + x\left(\dfrac{x^2}{4} + \dfrac{y^2}{9}\right),$
$f_y = 2y\left[1 + \dfrac{2}{9}\left(\dfrac{x^2}{4} + \dfrac{y^2}{9}\right)\right].$
The condition $f_y = 0$ is satisfied if either
$y = 0$ or $1 + \dfrac{2}{9}\left(\dfrac{x^2}{4} + \dfrac{y^2}{9}\right) = 0.$
The latter is impossible. Therefore $f_y = 0$ implies $y = 0$. But if $y = 0$, the
condition $f_x = 0$ means
$2(x - 1) + \dfrac{x^3}{4} = 0$, that is, $x^3 + 8x - 8 = 0.$
A rough sketch shows this cubic has only one real root, and the root is near
$x = 1$. By Newton's Method iterated twice, $x \approx 0.9068$.
Thus $f(x, y)$ has a possible minimum only at the point $(a, 0)$, where
$a^3 + 8a - 8 = 0$. Test the second derivatives at $(a, 0)$:
$f_{xx} = 2 + \tfrac{3}{4}a^2 > 0, \qquad f_{xy} = 0, \qquad f_{yy} = 2 + \tfrac{1}{9}a^2 > 0,$
$f_{xx}f_{yy} - f_{xy}^2 > 0.$
Therefore the minimum does occur at $(a, 0)$.
Answer: $(a, 0, \tfrac{1}{4}a^2)$, where
$a^3 + 8a - 8 = 0$, so $a \approx 0.9068$.
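The two Newton iterations mentioned in the solution can be reproduced as follows. This is a minimal sketch for the cubic $a^3 + 8a - 8 = 0$; the starting guess $x_0 = 1$ is an assumption based on the remark that the root is near 1, and the helper name `newton` is illustrative.

```python
def newton(f, df, x0, steps):
    """Apply Newton's Method x -> x - f(x)/f'(x) a fixed number of times."""
    x = x0
    for _ in range(steps):
        x = x - f(x) / df(x)
    return x

def cubic(x):
    return x**3 + 8 * x - 8

def cubic_slope(x):
    return 3 * x**2 + 8

a = newton(cubic, cubic_slope, 1.0, 2)   # two iterations from x0 = 1
z = a * a / 4                            # height of the point (a, 0, a^2/4)
print(a, z, cubic(a))
```

Since the derivative $3x^2 + 8$ is always positive, the cubic is increasing and has exactly one real root, which agrees with the rough sketch described in the text.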
The following example shows that you must be careful when the variables
are restricted.
EXAMPLE 5.4
Find the points on the ellipsoid $x^2 + \dfrac{y^2}{9} + \dfrac{z^2}{4} = 1$ nearest to and
farthest from the origin.
Solution: The square of the distance from $(x, y, z)$ to $(0, 0, 0)$ is
$f(x, y) = x^2 + y^2 + z^2 = x^2 + y^2 + 4\left(1 - x^2 - \dfrac{y^2}{9}\right) = 4 - 3x^2 + \dfrac{5}{9}y^2.$
Since
$z^2 = 4\left(1 - x^2 - \dfrac{y^2}{9}\right) \ge 0,$
there is a natural restriction on $x$ and $y$:
$x^2 + \dfrac{y^2}{9} \le 1.$
Thus the domain of $f(x, y)$ is the closed set bounded by the ellipse $x^2 + \tfrac{1}{9}y^2 = 1$.
The first partials are
$f_x = -6x, \qquad f_y = \dfrac{10}{9}y.$
These vanish at $(0, 0)$ only. However, at $(0, 0)$,
$f_{xx}f_{yy} - f_{xy}^2 = -\dfrac{60}{9} < 0.$
Therefore $f(x, y)$ has neither a maximum nor a minimum at $(0, 0)$. There
seems to be no possible maximum or minimum.
We have forgotten the boundary! The continuous function $f$ has both a
maximum and a minimum in its bounded closed domain, and since they do not
occur inside the domain, they must occur on the boundary curve (Fig. 5.1).
Notice that if $(x, y)$ is on the ellipse, then
$x^2 + \dfrac{y^2}{9} = 1$ and $z = 0$.
Thus the nearest and farthest points from the origin are actually points on the
ellipse. By inspection (Fig. 5.1), they are the points $(\pm 1, 0, 0)$ and $(0, \pm 3, 0)$.
[The points $(0, 0, \pm 2)$ are saddle points for $f(x, y)$.]
Answer: Nearest: $(\pm 1, 0, 0)$; farthest: $(0, \pm 3, 0)$.
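The boundary search can also be carried out numerically instead of by inspection. The sketch below is an illustration only: it parametrizes the boundary ellipse as $x = \cos t$, $y = 3\sin t$ (a parametrization not used in the text), evaluates the squared distance $4 - 3x^2 + \tfrac{5}{9}y^2$ there, and reports the extreme values found on a sample of 3600 points.

```python
import math

def squared_distance(x, y):
    """f(x, y) = 4 - 3x^2 + (5/9)y^2, the squared distance to the origin."""
    return 4 - 3 * x**2 + 5 * y**2 / 9

values = []
for i in range(3600):
    t = 2 * math.pi * i / 3600
    x, y = math.cos(t), 3 * math.sin(t)   # a point on the boundary ellipse
    values.append((squared_distance(x, y), x, y))

nearest = min(values)    # (smallest squared distance, x, y)
farthest = max(values)   # (largest squared distance, x, y)
print(nearest, farthest)
```

On the boundary the squared distance reduces to $1 + 8\sin^2 t$, so it ranges from 1 (distance 1) to 9 (distance 3), consistent with the answer.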
EXERCISES
Find the point nearest the origin in the first octant and on the surface:
1. xyz =1 2. xy2z 1.
3. Find the maximum and minimum of $\sin(xy)$ for $0 \le x, y \le \pi$.
4. Find the points on the hyperboloid $x^2 + y^2/4 - z^2/9 = 1$ nearest the origin.
5. Given n points Pi, , Pn of the plane, find the point P such that "=i PP/ is
least.
6. For each point (x, y, z ) , let f(x, y, z) denote the sum of the squares of the distances
of (x, y, z) to the three coordinate axes. Find the maximum of f(x, y, z) on the
unit sphere x2 + y2 + z2 = 1.
7. Find the least distance between the line x(t ) = (+ 1, t, t 3) and the line
y( r ) = (2r +3, r + 1, r 2).
8. Give an alternate solution to Example 5.3. By inspection, find the point closest to
i on each cross-section z = constant.
9. Approximate to 3 places the minimum first octant distance from the origin to the
surface xeyz = 1.
6. THREE VARIABLES
The extension to three variables of the second derivative test for maxima
and minima is straightforward.
Theorem Let $\mathbf{a}$ be an interior point of the domain of $f$ and suppose
grad $f(\mathbf{a}) = \mathbf{0}$.
Let
$H_f(\mathbf{a}) = \begin{bmatrix} f_{xx} & f_{xy} & f_{xz} \\ f_{yx} & f_{yy} & f_{yz} \\ f_{zx} & f_{zy} & f_{zz} \end{bmatrix}$
denote the Hessian matrix of $f$ at $\mathbf{a}$.
(1) If $H_f(\mathbf{a})$ is positive definite, then $f(\mathbf{a})$ is a relative minimum of $f$,
that is, there is $\delta > 0$ such that $f(\mathbf{x}) > f(\mathbf{a})$ whenever $0 <
|\mathbf{x} - \mathbf{a}| < \delta$.
(2) If $H_f(\mathbf{a})$ is negative definite, then $f(\mathbf{a})$ is a relative maximum of $f$,
that is, there is $\delta > 0$ such that $f(\mathbf{x}) < f(\mathbf{a})$ whenever $0 <
|\mathbf{x} - \mathbf{a}| < \delta$.
(3) If $H_f(\mathbf{a})$ is indefinite, then $f(\mathbf{a})$ is neither a relative max nor min.
(4) Except in these cases, no conclusion can be deduced from $H_f(\mathbf{a})$
alone.
The proof of (1) is essentially the same as the proof of Theorem 4.1. As we
know, (2) follows by applying (1) to $-f$. The proof of (3) is essentially the
proof of Theorem 4.2.
We can use the practical tests in Chapter 7. For instance Theorem 5.2,
p. 248, tells us that $H_f$ is positive definite if and only if
$f_{xx} > 0, \qquad \begin{vmatrix} f_{xx} & f_{xy} \\ f_{yx} & f_{yy} \end{vmatrix} > 0, \qquad$ and $\quad |H_f| > 0.$
Let us take a very simple example. Suppose
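$f(x, y, z) = ax^2 + by^2 + cz^2,$
where $a$, $b$, $c$ are nonzero. Then
grad $f = (2ax, 2by, 2cz)$,
so grad $f(\mathbf{x}) = \mathbf{0}$ only at $\mathbf{x} = \mathbf{0}$. The matrix $A$ in this case is
$A = \begin{bmatrix} f_{xx} & f_{xy} & f_{xz} \\ f_{yx} & f_{yy} & f_{yz} \\ f_{zx} & f_{zy} & f_{zz} \end{bmatrix} = \begin{bmatrix} 2a & 0 & 0 \\ 0 & 2b & 0 \\ 0 & 0 & 2c \end{bmatrix}.$
According to the test, $f(\mathbf{0})$ is a minimum if the three determinants are positive:
$2a > 0, \qquad \begin{vmatrix} 2a & 0 \\ 0 & 2b \end{vmatrix} > 0, \qquad \begin{vmatrix} 2a & 0 & 0 \\ 0 & 2b & 0 \\ 0 & 0 & 2c \end{vmatrix} > 0,$
that is, if
$2a > 0, \qquad 4ab > 0, \qquad 8abc > 0.$
This is so precisely if $a > 0$, $b > 0$, and $c > 0$.
Similarly $f(\mathbf{0})$ is a maximum if
$2a < 0, \qquad 4ab > 0, \qquad 8abc < 0,$
which is so precisely if $a < 0$, $b < 0$, and $c < 0$.
These results agree with common sense:
$f(\mathbf{x}) = ax^2 + by^2 + cz^2,$
so obviously $f(\mathbf{0}) = 0$. If $a$, $b$, $c$ are all positive and $\mathbf{x} \ne \mathbf{0}$, then $f(\mathbf{x}) > 0$.
If $a$, $b$, $c$ are all negative and $\mathbf{x} \ne \mathbf{0}$, then $f(\mathbf{x}) < 0$. (If $a$, $b$, $c$ are not all of the
same sign, then $f(\mathbf{0})$ is neither a maximum nor a minimum. Why?)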
EXAMPLE 6.1
Find all maxima and minima of the function
$f(\mathbf{x}) = x^2 + 3y^2 + 4z^2 - 2xy - 2yz + 2zx.$
Solution:
grad $f = (2x - 2y + 2z,\; -2x + 6y - 2z,\; 2x - 2y + 8z)$.
The vector equation grad $f = \mathbf{0}$ amounts to the system of scalar equations
(divide by 2)
$x - y + z = 0$
$-x + 3y - z = 0$
$x - y + 4z = 0,$
whose only solution is $\mathbf{x} = \mathbf{0}$.
The matrix $A$ is
$\begin{bmatrix} f_{xx} & f_{xy} & f_{xz} \\ f_{yx} & f_{yy} & f_{yz} \\ f_{zx} & f_{zy} & f_{zz} \end{bmatrix} = \begin{bmatrix} 2 & -2 & 2 \\ -2 & 6 & -2 \\ 2 & -2 & 8 \end{bmatrix},$
so the three relevant determinants are
$2, \qquad \begin{vmatrix} 2 & -2 \\ -2 & 6 \end{vmatrix} = 8, \qquad$ and $\quad \begin{vmatrix} 2 & -2 & 2 \\ -2 & 6 & -2 \\ 2 & -2 & 8 \end{vmatrix} = 48.$
All are positive. Hence by the test above, $f(\mathbf{0})$ is a minimum of $f$, the only one;
there are no maxima.
Answer: The only extreme value of $f$
is the minimum $f(\mathbf{0}) = 0$.
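The three determinants used in this test are the leading principal minors of the Hessian matrix, and for small matrices they are simple to compute directly. The following sketch uses cofactor expansion; the helper names `det` and `leading_minors` are illustrative, and the approach is meant for hand-sized matrices only, since cofactor expansion becomes slow as the size grows.

```python
def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        # Delete row 0 and column j to form the minor.
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def leading_minors(m):
    """Determinants of the upper-left 1x1, 2x2, ..., nxn blocks."""
    return [det([row[:k] for row in m[:k]]) for k in range(1, len(m) + 1)]

hessian = [[2, -2, 2], [-2, 6, -2], [2, -2, 8]]   # matrix of Example 6.1
minors = leading_minors(hessian)
print(minors, all(d > 0 for d in minors))
```

All three minors being positive is the criterion quoted from Chapter 7 for the matrix to be positive definite.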
EXERCISES
The second derivative test is inconclusive at $(0, 0, 0)$ for the given function.
Determine nonetheless if the function has a maximum, a minimum, or neither at the origin:
1. x2 + y2 + z4 2. x2 + y2z2
3. x2 + y2 4. x4+ y2z6
5. x2 + y4 + 26 6. x*y*
7. x4 + 2/323 8. x4y4 25.
9. x4 + y2z2 10. xs + y3 + 2*
11. x ^ z 3 12. x2y2z2.
Find the extreme values:
13. 2x2y2 322 + 2xy 2 xz
14. x2 + 2y2 + z2 + 2xy tyz
15. 2x2 + y2 + 2z2 + 2xy + 2yz + 2zx + x 3z
16. 2 + 3^2/ + y2z2x 2y + 2 + 3.
Determine if the function has a maximum, a minimum, or neither at the origin:
17. x2 + y2 + z2 + xy + yz + zx 18. x2 + 4y2 + 9z2xy 2yz
19. - x 2 - 2y2- z2+ yz 20. x2+ y2+ 2z2- 10yz
21. x2y2 + 3z2 + 12xy 22. Sx2 + y2 + 4z2xy yz zx.
23. A surface $\mathbf{x} = \mathbf{x}(u, v)$ and a curve $\mathbf{p} = \mathbf{p}(t)$ are given. Suppose the distance
$|\mathbf{x}(u, v) - \mathbf{p}(t)|$ of a point on the surface to a point of the curve is minimized (or
maximized) for $\mathbf{x}_0$ on the surface and $\mathbf{p}_0$ on the curve. Show that the segment from
$\mathbf{x}_0$ to $\mathbf{p}_0$ is normal to both the surface and the curve. Find the one exception to this
assertion.
24. (cont.) Formulate the corresponding statement for two surfaces. (This is a
four-variable problem!)
7. MAXIMA WITH CONSTRAINTS [optional]
Here are several problems that have a common feature.
(a) Of all rectangles with perimeter one, which has the shortest diagonal?
That is, minimize $(x^2 + y^2)^{1/2}$ subject to $2x + 2y = 1$.
(b) Of all right triangles with perimeter one, which has largest area?
That is, maximize $xy/2$ subject to $x + y + (x^2 + y^2)^{1/2} = 1$.
(c) Find the largest value of $x + 2y + 3z$ for points $(x, y, z)$ on the unit
sphere $x^2 + y^2 + z^2 = 1$.
(d) Of all rectangular boxes with fixed surface area, which has greatest
volume? That is, maximize $xyz$ subject to $xy + yz + zx = c$.
Each of these problems asks for the maximum (or minimum) of a function
of several variables, where the variables must satisfy an equation (constraint).
For example, in (a) you are asked to minimize
f ( x, y) = x2 + y2,
where x and y must satisfy
g(x, y) = 2x + 2y - 1 = 0.
Such problems may be analyzed geometrically. Suppose you are asked
to maximize a function $f(x, y)$, subject to a constraint $g(x, y) = 0$. On the
same graph plot $g(x, y) = 0$ and several level curves of $f(x, y)$, noting the
direction of increase of the level (Fig. 7.1). To find the largest value of $f(x, y)$
on the curve $g(x, y) = 0$, find the highest level curve that intersects $g = 0$.
If there is a highest one and the intersection does not take place at an end
point, this level curve and the graph $g = 0$ are tangent.
Suppose $f(x, y) = M$ is a level curve tangent to $g(x, y) = 0$ at a point
$(x, y)$. See Fig. 7.2. Since the two graphs are tangent at $(x, y)$, their normals
at $(x, y)$ are parallel. But the vectors
grad $f(x, y)$, grad $g(x, y)$
point in the respective normal directions (see p. 298), hence one is a multiple
of the other:
grad $f(x, y) = \lambda$ grad $g(x, y)$
for some number $\lambda$. (The argument presupposes that grad $g \ne 0$ at the point
in question.)
This geometric argument yields a practical rule for locating points on
$g(x, y) = 0$ where $f(x, y)$ may have a maximum or minimum. Note that
where the condition of tangency is satisfied, there may be a maximum, a
minimum, or neither (Fig. 7.3).
To maximize or minimize a function $f(x, y)$ subject to a constraint
$g(x, y) = 0$, solve the system of equations
$(f_x, f_y) = \lambda(g_x, g_y), \qquad g(x, y) = 0$
in the three unknowns $x$, $y$, $\lambda$. Each resulting point $(x, y)$ is a candidate.
The number $\lambda$ is called a Lagrange multiplier, or simply multiplier.
To apply this rule, three simultaneous equations
$f_x(x, y) = \lambda g_x(x, y)$
$f_y(x, y) = \lambda g_y(x, y)$
$g(x, y) = 0$
must be solved for three unknowns $x$, $y$, $\lambda$.
EXAMPLE 7.1
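Find the largest and smallest values of $f(x, y) = x + 2y$ on the
circle $x^2 + y^2 = 1$.
Solution: Draw a figure (Fig. 7.4). As seen from the figure, $f$ takes its
maximum at a point in the first quadrant, and its minimum at a point in the
third quadrant. Here
$$f(x, y) = x + 2y, \qquad g(x, y) = x^2 + y^2 - 1;$$
$$\operatorname{grad} f = (1, 2), \qquad \operatorname{grad} g = (2x, 2y).$$
The conditions
$$(f_x, f_y) = \lambda (g_x, g_y), \qquad g(x, y) = 0$$
become
$$(1, 2) = \lambda (2x, 2y), \qquad x^2 + y^2 = 1.$$
Thus
$$x = \frac{1}{2\lambda}, \qquad y = \frac{1}{\lambda}.$$
By the third equation,
$$\frac{1}{4\lambda^2} + \frac{1}{\lambda^2} = 1, \qquad \lambda^2 = \frac{5}{4}, \qquad \lambda = \pm\frac{1}{2}\sqrt{5}.$$
The value $\lambda = \frac{1}{2}\sqrt{5}$ yields
$$x = \frac{1}{\sqrt{5}}, \qquad y = \frac{2}{\sqrt{5}}, \qquad f(x, y) = \frac{5}{\sqrt{5}} = \sqrt{5};$$
the value $\lambda = -\frac{1}{2}\sqrt{5}$ yields
$$x = -\frac{1}{\sqrt{5}}, \qquad y = -\frac{2}{\sqrt{5}}, \qquad f(x, y) = -\sqrt{5}.$$
Answer: Largest $\sqrt{5}$; smallest $-\sqrt{5}$.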
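The multiplier computation in Example 7.1 can be cross-checked numerically. The following Python sketch is not part of the original text; it evaluates $f$ at the two Lagrange candidates and compares them with a brute-force walk around the circle. The sample count `n` is an arbitrary choice.

```python
import math

def f(x, y):
    return x + 2 * y

# Lagrange candidates: (1, 2) = lam * (2x, 2y) gives x = 1/(2 lam), y = 1/lam,
# and the constraint x^2 + y^2 = 1 forces lam^2 = 5/4.
candidates = []
for lam in (math.sqrt(5) / 2, -math.sqrt(5) / 2):
    x, y = 1 / (2 * lam), 1 / lam
    candidates.append((x, y, f(x, y)))

largest = max(c[2] for c in candidates)
smallest = min(c[2] for c in candidates)

# Independent brute-force check: sample f along x = cos t, y = sin t.
n = 100_000  # assumed resolution, chosen only for illustration
samples = [f(math.cos(2 * math.pi * k / n), math.sin(2 * math.pi * k / n))
           for k in range(n)]

print(largest, smallest)            # candidates from the multiplier rule
print(max(samples), min(samples))   # sampled extremes for comparison
```

The sampled extremes only approximate the true ones, so agreement to several decimals is the most one should expect from the comparison.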
EXAMPLE 7.2
Find the largest and smallest values of $xy$ on the
segment $2x + y = 2$, $x \ge 0$, $y \ge 0$.
Solution: Draw a graph (Fig. 7.5). Evidently the smallest value of
$xy$ is 0, taken at either end point.
To find the largest value, use the multiplier technique with
$$f(x, y) = xy, \qquad g(x, y) = 2x + y - 2.$$
The relevant system of equations is
$$(y, x) = \lambda (2, 1), \qquad 2x + y - 2 = 0.$$
Thus $x = \lambda$, $y = 2\lambda$, and
$$2\lambda + 2\lambda - 2 = 0, \qquad \lambda = \frac{1}{2}.$$
Therefore
$$(x, y) = \left(\frac{1}{2}, 1\right), \qquad f\left(\frac{1}{2}, 1\right) = \frac{1}{2}.$$
Answer: Largest $\frac{1}{2}$; smallest 0.
EXAMPLE 7.3
Find the largest and smallest values of $x^2 + y^2$, where $x^4 + y^4 = 1$.
Solution: Graph the curve $x^4 + y^4 = 1$ and the level curves of
$f(x, y) = x^2 + y^2$. See Fig. 7.6. By drawing $x^4 + y^4 = 1$ accurately, you see
that the graph is quite flat where it crosses the axes and most sharply curved
where it crosses the $45^\circ$ lines $y = \pm x$. It is closest to the origin ($x^2 + y^2$ is
least) at $(\pm 1, 0)$ and $(0, \pm 1)$, and farthest where $y = \pm x$.
The analysis confirms this. Use the multiplier technique with
$$f(x, y) = x^2 + y^2, \qquad g(x, y) = x^4 + y^4 - 1.$$
The relevant equations are
$$(2x, 2y) = \lambda (4x^3, 4y^3), \qquad x^4 + y^4 = 1.$$
Obvious solutions are
$$x = 0,\ y = \pm 1,\ \lambda = \tfrac{1}{2}; \qquad y = 0,\ x = \pm 1,\ \lambda = \tfrac{1}{2}.$$
Thus the points $(0, \pm 1)$ and $(\pm 1, 0)$ are candidates for the maximum or
minimum. At each of these points $f(x, y) = 1$.
Suppose both $x \neq 0$ and $y \neq 0$. From
$$2x = 4\lambda x^3, \qquad 2y = 4\lambda y^3$$
follows
$$x^2 = y^2 = \frac{1}{2\lambda}.$$
Hence $\lambda = 1/(2x^2) > 0$. From $x^4 + y^4 = 1$ follows
$$\left(\frac{1}{2\lambda}\right)^2 + \left(\frac{1}{2\lambda}\right)^2 = 1, \qquad \lambda^2 = \frac{1}{2}, \qquad x^2 = y^2 = \frac{1}{\sqrt{2}}.$$
Consequently, the four points
$$\left(\pm\frac{1}{\sqrt[4]{2}},\ \pm\frac{1}{\sqrt[4]{2}}\right)$$
are candidates for the maximum or minimum. At each of these points
$f(x, y) = x^2 + y^2 = 2/\sqrt{2} = \sqrt{2}$.
Answer: Largest $\sqrt{2}$; smallest 1.
EXAMPLE 7.4
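Maximize $2y - x$ on the curve $y = \sin x$ for $0 \le x \le 2\pi$.
Solution: Here $f(x, y) = 2y - x$ and $g(x, y) = y - \sin x$. The equations are
$$(-1, 2) = \lambda (-\cos x, 1), \qquad y = \sin x,$$
from which follow $\lambda = 2$, $\cos x = \frac{1}{2}$, $x = \pi/3$ or $5\pi/3$. The maximum must
occur at
$$\left(\frac{\pi}{3}, \frac{\sqrt{3}}{2}\right) \quad\text{or}\quad \left(\frac{5\pi}{3}, -\frac{\sqrt{3}}{2}\right)$$
or at one of the end points $(0, 0)$ and $(2\pi, 0)$. But
$$f\left(\frac{\pi}{3}, \frac{\sqrt{3}}{2}\right) = \sqrt{3} - \frac{\pi}{3}, \qquad f\left(\frac{5\pi}{3}, -\frac{\sqrt{3}}{2}\right) = -\sqrt{3} - \frac{5\pi}{3},$$
$$f(0, 0) = 0, \qquad f(2\pi, 0) = -2\pi.$$
Of these numbers, $\sqrt{3} - \pi/3$ is the largest, being the only positive one.
Answer: $\sqrt{3} - \dfrac{\pi}{3}$.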
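Example 7.4 shows why end points must be listed alongside the multiplier candidates. A minimal Python sketch of that bookkeeping follows; it is an illustration rather than the book's method, it substitutes $y = \sin x$ so that $f$ becomes a function of $x$ alone, and the grid size `n` is an assumption.

```python
import math

def f_on_curve(x):
    # f(x, y) = 2y - x restricted to the curve y = sin x
    return 2 * math.sin(x) - x

# Interior candidates from cos x = 1/2, plus the two end points of [0, 2*pi].
candidates = [math.pi / 3, 5 * math.pi / 3, 0.0, 2 * math.pi]
best_x = max(candidates, key=f_on_curve)
best_value = f_on_curve(best_x)

# Brute-force comparison on a fine grid of the whole interval.
n = 200_000  # assumed grid size
grid_max = max(f_on_curve(2 * math.pi * k / n) for k in range(n + 1))

print(best_x, best_value, grid_max)
```

Changing the interval in `candidates` and in the grid (for instance to start at $\pi/2$) is one way to experiment with the question posed in the Remark.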
Remark: The preceding example is illustrated in Fig. 7.7. From the
figure, can you tell where the maximum occurs if $x$ is restricted to the interval
$\pi/2 \le x \le 2\pi$?
Fig. 7.7
EXERCISES
1. Find the maximum and minimum of $x + y$ on the ellipse $(x^2/4) + (y^2/9) = 1$.
2. Find the extreme values of $x - y$ on the branch $x > 0$ of the hyperbola
$(x^2/9) - (y^2/4) = 1$.
3. Find the extreme values of $x - y$ on the branch $x > 0$ of the hyperbola
$(x^2/4) - (y^2/9) = 1$. Explain.
4. Find the extreme values of $x - y$ on the branch $x > 0$ of the hyperbola $x^2 - y^2 = 1$.
Explain.
5. Find the maximum and minimum of $xy$ on the circle $x^2 + y^2 = 1$.
6. Find the maximum and minimum of $xy$ on the ellipse $(x^2/4) + (y^2/9) = 1$.
7. Find the rectangle of perimeter 1 with shortest diagonal.
8. Find the right triangle of perimeter 1 with greatest area.
[Hint: Eliminate $\lambda$.]
9. Find the right circular cone of fixed lateral area with maximum volume.
10. Find the right circular cone of fixed total surface area with maximum volume.
11. Find the right circular cylinder of fixed lateral area with maximum volume.
12. Find the right circular cylinder of fixed total surface area with maximum volume.
13. Let $0 < p < q$. Find the maximum and minimum of $x^p + y^p$ on $x^q + y^q = 1$,
where $x \ge 0$ and $y \ge 0$.
14. (cont.) Let $0 < p < q$ and $x \ge 0$, $y \ge 0$. Show that
$$\left(\frac{x^p + y^p}{2}\right)^{1/p} \le \left(\frac{x^q + y^q}{2}\right)^{1/q}.$$
15. Find the maximum and minimum of $x^2 y$ on the short arc of circle $x^2 + y^2 = 1$
between (^a/3, i ) and
8. FURTHER CONSTRAINT PROBLEMS [optional]
In space, the problem is to maximize (minimize) a function $f(x, y, z)$
subject to a constraint $g(x, y, z) = 0$. One seeks level surfaces of $f(x, y, z)$
tangent to the surface $g(x, y, z) = 0$. See Fig. 8.1. Each point of tangency
is a candidate for a maximum or minimum. At a point of tangency, the
normals to the two surfaces are parallel (Fig. 8.2). But the two vectors
$$\operatorname{grad} f(\mathbf{x}), \qquad \operatorname{grad} g(\mathbf{x})$$
point in the respective normal directions, so the first must be a multiple of the
second, provided $\operatorname{grad} g(\mathbf{x}) \neq 0$. This observation leads to a practical method
for locating possible maxima and minima (proof omitted).
Fig. 8.2. The surfaces $g = 0$ and $f = M$, with the common direction of normals
To maximize (minimize) a function $f(x, y, z)$ subject to a constraint
$g(x, y, z) = 0$, solve the system of equations
$$(f_x, f_y, f_z) = \lambda (g_x, g_y, g_z), \qquad g(\mathbf{x}) = 0$$
in the four unknowns $x$, $y$, $z$, $\lambda$. Each resulting point $(x, y, z)$ is a candidate.
In applications, the usual precautions concerning the boundary must be observed.
EXAMPLE 8.1
Find the longest and shortest distance from the origin to the
ellipsoid $x^2 + \dfrac{y^2}{9} + \dfrac{z^2}{4} = 1$.
Solution: Set $f(x, y, z)$ equal to the square of the distance,
$$f(x, y, z) = x^2 + y^2 + z^2,$$
and set
$$g(x, y, z) = x^2 + \frac{y^2}{9} + \frac{z^2}{4} - 1.$$
Then
$$\operatorname{grad} f = (2x, 2y, 2z), \qquad \operatorname{grad} g = \left(2x, \frac{2y}{9}, \frac{z}{2}\right).$$
The required system of equations is
$$(2x, 2y, 2z) = \lambda \left(2x, \frac{2y}{9}, \frac{z}{2}\right), \qquad x^2 + \frac{y^2}{9} + \frac{z^2}{4} = 1.$$
If $x \neq 0$, then by the first equation $\lambda = 1$; consequently $y = z = 0$, so by
the second equation $x^2 = 1$, $x = \pm 1$. Thus two candidates are the points
$(\pm 1, 0, 0)$. Similarly, there are four other candidates, namely
$$\lambda = 9:\ (0, \pm 3, 0); \qquad \lambda = 4:\ (0, 0, \pm 2).$$
Since
$$f(\pm 1, 0, 0) = 1, \qquad f(0, \pm 3, 0) = 9, \qquad f(0, 0, \pm 2) = 4,$$
the maximum distance is $\sqrt{9}$ and the minimum distance is 1.
Answer: Longest 3; shortest 1.
Remark: Compare this procedure with the previous solution of the
same problem in Section 5, p. 347. The advantage of the present method will
be crystal clear.
EXAMPLE 8.2
Find the volume of the largest rectangular solid with sides parallel
to the coordinate axes that can be inscribed in the ellipsoid
$$x^2 + \frac{y^2}{9} + \frac{z^2}{4} = 1.$$
Solution: One-eighth of the volume is
$$f(x, y, z) = xyz,$$
where $x > 0$, $y > 0$, $z > 0$. See Fig. 8.3. The constraint is $g(x, y, z) = 0$,
where
$$g(x, y, z) = x^2 + \frac{y^2}{9} + \frac{z^2}{4} - 1.$$
Set $\operatorname{grad} f = \lambda \operatorname{grad} g$ and $g = 0$; that is,
$$yz = 2\lambda x, \qquad zx = \frac{2}{9}\lambda y, \qquad xy = \frac{1}{2}\lambda z, \qquad x^2 + \frac{y^2}{9} + \frac{z^2}{4} = 1.$$
Multiply the first two equations and cancel $xy$:
$$z^2 = \frac{4}{9}\lambda^2.$$
Likewise
$$x^2 = \frac{1}{9}\lambda^2, \qquad y^2 = \lambda^2.$$
Substitute in the fourth equation:
$$\frac{1}{9}\lambda^2 + \frac{1}{9}\lambda^2 + \frac{1}{9}\lambda^2 = 1, \qquad \lambda^2 = 3,$$
$$x^2 = \frac{1}{3}, \qquad y^2 = 3, \qquad z^2 = \frac{4}{3}.$$
Hence
$$f(x, y, z)^2 = x^2 y^2 z^2 = \frac{4}{3}, \qquad f_{\max} = \frac{2}{\sqrt{3}}.$$
Answer: $8 \cdot \dfrac{2}{\sqrt{3}} = \dfrac{16}{\sqrt{3}}$.
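The value $16/\sqrt{3}$ for the inscribed solid can be compared with a direct search over the first-octant part of the ellipsoid. The sketch below is only a numerical illustration: the parametrization by two angles and the grid size `n` are assumptions introduced here, not part of the text.

```python
import math

def box_volume(phi, theta):
    # A corner (x, y, z) on x^2 + y^2/9 + z^2/4 = 1 in the first octant,
    # written with two angles; the inscribed solid has volume 8xyz.
    x = math.sin(phi) * math.cos(theta)
    y = 3 * math.sin(phi) * math.sin(theta)
    z = 2 * math.cos(phi)
    return 8 * x * y * z

# Value predicted by the multiplier equations: x^2 = 1/3, y^2 = 3, z^2 = 4/3.
predicted = 8 * math.sqrt(1 / 3) * math.sqrt(3) * math.sqrt(4 / 3)

n = 400  # assumed grid resolution in each angle
h = (math.pi / 2) / n
grid_best = max(box_volume(i * h, j * h)
                for i in range(n + 1) for j in range(n + 1))

print(predicted, grid_best)
```

The grid value can only fall slightly short of the predicted maximum, which is the behaviour expected if the multiplier candidate is indeed the largest.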
Two Constraints
Fig. 8.4. two constraints
Suppose the problem is to maximize (minimize) $f(x, y, z)$, where $(x, y, z)$
is subject to two constraints, $g(x, y, z) = 0$ and $h(x, y, z) = 0$. Each constraint
defines a surface, and these two surfaces in general have a curve of
intersection (Fig. 8.4). A candidate for a maximum or minimum of $f(\mathbf{x})$ is a
point $\mathbf{x}$ where a level surface of $f$ is tangent to this curve of intersection (Fig.
8.5). The vector $\operatorname{grad} f(\mathbf{x})$ is normal to the level surface at $\mathbf{x}$, hence normal to
the curve. But the vectors $\operatorname{grad} g(\mathbf{x})$ and $\operatorname{grad} h(\mathbf{x})$ determine the normal
plane to the curve at $\mathbf{x}$. Hence for some constants $\lambda$ and $\mu$,
$$\operatorname{grad} f(\mathbf{x}) = \lambda \operatorname{grad} g(\mathbf{x}) + \mu \operatorname{grad} h(\mathbf{x}).$$
Remark: The existence of such multipliers $\lambda$ and $\mu$ presupposes that
$\operatorname{grad} g \neq 0$, $\operatorname{grad} h \neq 0$, and that neither is a multiple of the other.
To maximize (minimize) a function $f(x, y, z)$ subject to two constraints
$g(x, y, z) = 0$ and $h(x, y, z) = 0$, solve the system of five equations
$$(f_x, f_y, f_z) = \lambda (g_x, g_y, g_z) + \mu (h_x, h_y, h_z), \qquad g(\mathbf{x}) = 0, \qquad h(\mathbf{x}) = 0$$
in five unknowns $x$, $y$, $z$, $\lambda$, $\mu$. Each resulting point $(x, y, z)$ is a candidate.
EXAMPLE 8.3
Find the maximum and minimum of $f(x, y, z) = x + 2y + z$ on
the ellipse $x^2 + y^2 = 1$, $y + z = 1$.
Solution: Here
$$g(x, y, z) = x^2 + y^2 - 1, \qquad h(x, y, z) = y + z - 1;$$
Fig. 8.6 shows the curve of intersection of the surfaces $g(\mathbf{x}) = 0$ and $h(\mathbf{x}) = 0$,
a plane section of a right circular cylinder. The equations to be solved are
$$(1, 2, 1) = \lambda (2x, 2y, 0) + \mu (0, 1, 1), \qquad x^2 + y^2 = 1, \qquad y + z = 1.$$
Hence
$$1 = 2\lambda x, \qquad 2 = 2\lambda y + \mu, \qquad 1 = \mu; \qquad x = \frac{1}{2\lambda}, \qquad y = \frac{1}{2\lambda}.$$
Substituting in $x^2 + y^2 = 1$ gives $1/(2\lambda^2) = 1$, so $\lambda = \pm\sqrt{2}/2$.
The solution $\lambda = \sqrt{2}/2$ leads to the point
$$\mathbf{x}_1 = \left(\frac{\sqrt{2}}{2},\ \frac{\sqrt{2}}{2},\ 1 - \frac{\sqrt{2}}{2}\right)$$
and the solution $\lambda = -\sqrt{2}/2$ leads to the point
$$\mathbf{x}_2 = \left(-\frac{\sqrt{2}}{2},\ -\frac{\sqrt{2}}{2},\ 1 + \frac{\sqrt{2}}{2}\right).$$
These are the only candidates for maxima or minima. But
$$f(\mathbf{x}_1) = 1 + \sqrt{2}, \qquad f(\mathbf{x}_2) = 1 - \sqrt{2}.$$
Answer: Maximum $1 + \sqrt{2}$; minimum $1 - \sqrt{2}$.
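The two-constraint computation of Example 8.3 can be checked against a direct parametrization of the ellipse. The Python sketch below is an added illustration; it assumes the parametrization $x = \cos t$, $y = \sin t$, $z = 1 - \sin t$ and an arbitrary sample count `n`.

```python
import math

def f(x, y, z):
    return x + 2 * y + z

# Candidates from the multiplier equations: mu = 1, x = y = 1/(2 lam),
# lam = +sqrt(2)/2 or -sqrt(2)/2, and z = 1 - y from the plane y + z = 1.
values = []
for lam in (math.sqrt(2) / 2, -math.sqrt(2) / 2):
    x = y = 1 / (2 * lam)
    z = 1 - y
    values.append(f(x, y, z))

# Brute force along the curve of intersection of the cylinder and the plane.
n = 100_000  # assumed number of sample points
curve = [f(math.cos(t), math.sin(t), 1 - math.sin(t))
         for t in (2 * math.pi * k / n for k in range(n))]

print(max(values), min(values))
print(max(curve), min(curve))
```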
Quadratic Forms
In Chapter 7, we postponed part of the proof of Theorem 8.5 that a $3 \times 3$
symmetric matrix has three real characteristic roots. We fill this gap now
using Lagrange multipliers.
Let $A$ be a $3 \times 3$ symmetric matrix. We consider the problem of finding
the minimum of $f(\mathbf{x}) = \mathbf{x}A\mathbf{x}'$ on the surface of the unit sphere. We set
$g(\mathbf{x}) = \mathbf{x}\mathbf{x}' - 1$, so the problem is to minimize $f(\mathbf{x})$ subject to $g(\mathbf{x}) = 0$.
Since the (surface of the) unit sphere is a closed bounded set, the continuous
function $f(\mathbf{x})$ has a minimum value at some point $\mathbf{x}_0$. By Lagrange
multiplier theory, $g(\mathbf{x}_0) = 0$, that is, $\mathbf{x}_0\mathbf{x}_0' = 1$, and
$$\operatorname{grad} f(\mathbf{x}_0) = \lambda \operatorname{grad} g(\mathbf{x}_0),$$
for some $\lambda$. A direct computation shows that
$$\operatorname{grad} f(\mathbf{x}) = 2A\mathbf{x}' \qquad\text{and}\qquad \operatorname{grad} g(\mathbf{x}) = 2\mathbf{x}',$$
hence the condition is
$$A\mathbf{x}_0' = \lambda \mathbf{x}_0'.$$
Thus $\mathbf{x}_0$ is a characteristic vector of $A$ and $\lambda$ is the corresponding characteristic
root. Furthermore,
$$f(\mathbf{x}_0) = \mathbf{x}_0 A \mathbf{x}_0' = \lambda \mathbf{x}_0 \mathbf{x}_0' = \lambda,$$
so $\lambda$ actually is the minimum value. It follows that $\lambda$ is the smallest real
characteristic root of $A$.
Similarly, $f(\mathbf{x})$ has a maximum value $\nu$ on the unit sphere; this value is the
largest real characteristic root of $A$.
It may be that $\lambda = \nu$. If so, then $f(\mathbf{x})$ has the constant value $\lambda$ on the
sphere. Hence $f(\mathbf{x}) = \lambda \mathbf{x}\mathbf{x}'$ and it follows that $A = \lambda I$. Thus the characteristic
roots of $A$ are $\lambda$, $\lambda$, $\lambda$, all real.
If $\lambda < \nu$, then $A$ has two distinct real characteristic roots. It follows that
the third characteristic root must also be real. For the characteristic polynomial
of $A$ is a cubic, and if a cubic has two real zeros, the third zero is also
real, since complex zeros of polynomials with real coefficients come in conjugate
pairs.
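The statement that the minimum of $\mathbf{x}A\mathbf{x}'$ on the unit sphere is a characteristic root can be observed numerically. The following Python sketch is an illustration only: the matrix `A`, the starting point, the step size and the iteration count are all assumptions chosen for the demonstration, and projected gradient descent is substituted for the existence argument used in the text.

```python
import math

# A hypothetical symmetric matrix; its characteristic roots are 1, 3, 3.
A = [[2.0, 1.0, 0.0],
     [1.0, 2.0, 0.0],
     [0.0, 0.0, 3.0]]

def mat_vec(M, v):
    return [sum(M[i][j] * v[j] for j in range(3)) for i in range(3)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    r = math.sqrt(dot(v, v))
    return [c / r for c in v]

# Projected gradient descent for f(x) = x A x' on the unit sphere:
# step against grad f = 2 A x', then rescale back onto the sphere.
x = normalize([1.0, 0.0, 0.0])
step = 0.1
for _ in range(200):
    grad = [2 * c for c in mat_vec(A, x)]
    x = normalize([xi - step * gi for xi, gi in zip(x, grad)])

lam = dot(x, mat_vec(A, x))   # value of f at the point found
residual = [a - lam * b for a, b in zip(mat_vec(A, x), x)]

print(lam, x, residual)
```

At the point found, the residual $A\mathbf{x}' - \lambda\mathbf{x}'$ should be negligible, which is the condition $A\mathbf{x}_0' = \lambda\mathbf{x}_0'$ derived above.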
A Second Derivative Test
Suppose we wish to minimize $f(x, y, z)$ subject to the constraint
$g(x, y, z) = 0$. Suppose we have found a solution $(x, y, z, \lambda)$ to the system of
equations
$$\operatorname{grad} f = \lambda \operatorname{grad} g, \qquad g = 0.$$
It would be nice to have a test for a minimum analogous to the second derivative
test in Section 6. There is such a test, but its proof is beyond the scope of
this course.
Theorem. Let $\mathbf{x}_0$, $\lambda$ be a solution of
$$\operatorname{grad} f = \lambda \operatorname{grad} g, \qquad g = 0,$$
where $\mathbf{x}_0$ is interior to the common domain of $f$ and $g$, and $(\operatorname{grad} g)(\mathbf{x}_0) \neq 0$.
If the matrix
$$H_f - \lambda H_g$$
is positive definite, then there is a $\delta > 0$ such that $f(\mathbf{x}) > f(\mathbf{x}_0)$ for all $\mathbf{x}$
satisfying $g(\mathbf{x}) = 0$ and $0 < |\mathbf{x} - \mathbf{x}_0| < \delta$.
EXERCISES
1. Assume $a, b, c > 0$. Find the volume of the largest rectangular solid (with sides
parallel to the coordinate planes) inscribed in the ellipsoid
$$\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1.$$
2. Find the triangle of largest area with fixed perimeter.
[Hint: Use Heron's formula $A^2 = s(s - x)(s - y)(s - z)$, where $x$, $y$, $z$ are the
sides and $s$ is the semiperimeter.]
3. Find the maximum and minimum of the function x + 2y + 3z on the sphere
x2 + y2 + z2 = 1.
4. Find the rectangular solid of fixed surface area with maximum volume.
5. Find the rectangular solid of fixed total edge length with maximum surface area.
6. Find the rectangular solid of fixed total edge length with maximum volume.
7. Maximize $xyz$ on $x + y + z = 1$.
8. (cont.) Conclude that $\sqrt[3]{xyz} \le \dfrac{x + y + z}{3}$ for $x \ge 0$, $y \ge 0$, and $z \ge 0$.
9. Let $0 < p < q$. Find the maximum and minimum of $x^p + y^p + z^p$ on the surface
$x^q + y^q + z^q = 1$, where $x \ge 0$, $y \ge 0$, and $z \ge 0$.
10. (cont.) Let $0 < p < q$ and $x \ge 0$, $y \ge 0$, $z \ge 0$. Show that
$$\left(\frac{x^p + y^p + z^p}{3}\right)^{1/p} \le \left(\frac{x^q + y^q + z^q}{3}\right)^{1/q}.$$
11. Show that $xy + yz + zx \le x^2 + y^2 + z^2$ for $x \ge 0$, $y \ge 0$, and $z \ge 0$.
12. Find the minimum of x2 + y2 + z2 on the line x + y + z = 1, x + 2y + 3z = 1.
13. Find the maximum and minimum volumes of a rectangular solid whose total edge
length is 24 ft and whose surface area is 22 ft2.
14. Find the maximum and minimum of x + y + z on the first octant portion of the
curve xyz 1, x2 + y2 + z2 = .
[Hint: Show that x, y, z are roots of the same quadratic equation; conclude that
two of them are equal.]
15*. Let $A$ and $B$ be $3 \times 3$ matrices with $B$ positive definite. Show that each Lagrange
multiplier $\lambda$ in the problem of maximizing $\mathbf{x}A\mathbf{x}'$ on the ellipsoid $\mathbf{x}B\mathbf{x}' = 1$ is a
characteristic root of $AB^{-1}$. Assume $A$ and $B$ are symmetric.
16. Let $f(x, y, z) = x^2 + y^2 + z - z^2$ and $g(x, y, z) = z$. Prove that the minimum of $f$
on $g = 0$ is $0 = f(0, 0, 0)$.
17. (cont.) Show however that the second derivative test at the end of the section is
inconclusive for this example.
10. Double Integrals
1. INTRODUCTION
A geometric motivation for the definite integral (also called simple
integral)
$$\int_a^b f(x)\,dx$$
is the problem of finding the area under a curve $y = f(x)$. Suppose we consider
instead the problem of finding the volume under a surface $z = f(x, y)$,
where $f(x, y)$ is a continuous (positive) function defined on a rectangular
domain
$$a \le x \le b, \qquad c \le y \le d.$$
This leads us to a new kind of integral called a double integral. Now the
theory behind the double integral (and the triple integral of the next chapter)
is rather technical and lengthy, so we shall postpone it until Chapter 12.
Meanwhile we shall proceed intuitively, taking a lot of things for granted. Our
present aim is to develop a working knowledge of how to set up, evaluate, and
apply double and triple integrals (called collectively multiple integrals).
Volume
Consider the prismatic solid (Fig. 1.1) bounded above by the surface
$z = f(x, y)$ and below by the rectangular domain $a \le x \le b$, $c \le y \le d$.
Intuitively it seems obvious that the solid has a well-defined volume. For
we can carve the solid out of a homogeneous material like clay, plaster, or
steel and then weigh it or submerge it in a tank of water to find its volume.
This is fine, but how do we compute the volume? We shall use a procedure
analogous to rectangular approximations for plane areas.
We shall approximate the given solid by a large number of long, thin
rectangular solids, and add up their volumes to approximate the desired
volume.
Partition the base of the solid into $mn$ equal small rectangles by drawing
segments parallel to the $x$- and $y$-axes that divide $[a, b]$ into $m$ equal parts
and $[c, d]$ into $n$ equal parts (see Fig. 1.2).
Fig. 1.2 division of the base into $mn$ equal rectangles
Thus
$$a = x_0 < x_1 < x_2 < \cdots < x_m = b, \qquad x_i - x_{i-1} = \frac{b - a}{m},$$
$$c = y_0 < y_1 < y_2 < \cdots < y_n = d, \qquad y_j - y_{j-1} = \frac{d - c}{n}.$$
The part of the solid above the $i, j$-th small rectangle is approximately a
thin rectangular column of height $f(\bar{x}_i, \bar{y}_j)$, where $(\bar{x}_i, \bar{y}_j)$ is the midpoint of
the base (Fig. 1.3).
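The volume of the column is
$$f(\bar{x}_i, \bar{y}_j)\,\frac{b - a}{m}\,\frac{d - c}{n}.$$
Add to approximate the total volume:
$$\sum_{i=1}^{m} \sum_{j=1}^{n} f(\bar{x}_i, \bar{y}_j)\,\frac{(b - a)(d - c)}{mn}.$$
Now partition the base into more and more rectangles by letting $m$ and $n$
both increase to infinity, so $(b - a)/m$ and $(d - c)/n$ both decrease to zero.
Then under reasonable conditions the double sum converges to a limit:
$$\lim_{m \to \infty,\ n \to \infty}\ \sum_{i=1}^{m} \sum_{j=1}^{n} f(\bar{x}_i, \bar{y}_j)\,\frac{(b - a)(d - c)}{mn}.$$
One reasonable condition is that $f$ be continuous, as will be shown in Chapter
12. The double sum expression suggests the double integral notation
$$\iint_D f(x, y)\,dx\,dy = \iint_{\substack{a \le x \le b \\ c \le y \le d}} f(x, y)\,dx\,dy,$$
where $D$ denotes the rectangular domain of $f$. Thus the full definition of the
double integral is
$$\iint_D f(x, y)\,dx\,dy = \lim_{m, n \to \infty}\ \sum_{\substack{i = 1, \ldots, m \\ j = 1, \ldots, n}} f(\bar{x}_i, \bar{y}_j)\left(\frac{b - a}{m}\right)\left(\frac{d - c}{n}\right),$$
where
$$\bar{x}_i = a + \left(i - \tfrac{1}{2}\right)\left(\frac{b - a}{m}\right), \qquad \bar{y}_j = c + \left(j - \tfrac{1}{2}\right)\left(\frac{d - c}{n}\right).$$
The formula for $\bar{x}_i$ simply says that $\bar{x}_i$ is the midpoint (average) of
$$x_{i-1} = a + (i - 1)\left(\frac{b - a}{m}\right) \qquad\text{and}\qquad x_i = a + i\left(\frac{b - a}{m}\right).$$
Likewise $\bar{y}_j = \frac{1}{2}(y_{j-1} + y_j)$.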
The definition does not require that $f$ be positive. If $f$ takes both positive
and negative values, the double integral represents an algebraic volume rather
than a geometric volume. The volume between the surface $z = f(x, y)$ and
the $x, y$-plane counts positively where $f > 0$ and negatively where $f < 0$.
Assuming the double integral exists, the sum formulas
$$\sum\sum k f(\bar{x}_i, \bar{y}_j) = k \sum\sum f(\bar{x}_i, \bar{y}_j),$$
$$\sum\sum \left[f(\bar{x}_i, \bar{y}_j) + g(\bar{x}_i, \bar{y}_j)\right] = \sum\sum f(\bar{x}_i, \bar{y}_j) + \sum\sum g(\bar{x}_i, \bar{y}_j)$$
imply the relations
$$\iint_D k f(x, y)\,dx\,dy = k \iint_D f(x, y)\,dx\,dy,$$
$$\iint_D \left[f(x, y) + g(x, y)\right] dx\,dy = \iint_D f(x, y)\,dx\,dy + \iint_D g(x, y)\,dx\,dy.$$
Once the double integral has been defined, the immediate problem is how
to evaluate it. Sections 2 and 3 provide a solution to this problem.
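The midpoint double sum in the definition translates directly into a short program. The Python sketch below is an added illustration, not the book's method of evaluation; the function name `midpoint_double_sum` and the test integrand are assumptions.

```python
def midpoint_double_sum(f, a, b, c, d, m, n):
    """Approximate the double integral of f over a <= x <= b, c <= y <= d
    by the double sum of f at the midpoints of the m*n small rectangles."""
    dx = (b - a) / m
    dy = (d - c) / n
    total = 0.0
    for i in range(1, m + 1):
        x_mid = a + (i - 0.5) * dx          # midpoint of [x_{i-1}, x_i]
        for j in range(1, n + 1):
            y_mid = c + (j - 0.5) * dy      # midpoint of [y_{j-1}, y_j]
            total += f(x_mid, y_mid)
    return total * dx * dy

# Volume under the plane z = x + y over the unit square.
approx = midpoint_double_sum(lambda x, y: x + y, 0.0, 1.0, 0.0, 1.0, 50, 50)
print(approx)
```

Increasing `m` and `n` mimics the limit in the definition; for a continuous integrand the printed values should settle down as the partition is refined.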
2. SPECIAL CASES
In certain cases where $f(x, y)$ has a particularly simple form, it is easy to
evaluate the double integral
$$\iint_D f(x, y)\,dx\,dy.$$
As before, $D$ denotes the rectangle $a \le x \le b$, $c \le y \le d$.
Suppose first that $f(x, y) = h$, a constant. Then the double integral
represents the volume of a rectangular solid of height $h$. This volume is $h$
times the area of the base:
$$\iint_D h\,dx\,dy = h \times (\text{area } D) = h(b - a)(d - c).$$
Suppose next that $f(x, y)$ is a function of $x$ alone,
$$f(x, y) = g(x).$$
Then each cross-section of the solid by a plane parallel to the $z, x$-plane is an
identical plane region (Fig. 2.1a, b). The volume is the area of this region times
the $y$-length of the rectangle:
$$\iint_D g(x)\,dx\,dy = \left(\int_a^b g(x)\,dx\right)(d - c).$$
Fig. 2.1 (b) area of the section
Similarly if $f(x, y) = h(y)$,
$$\iint_D h(y)\,dx\,dy = \left(\int_c^d h(y)\,dy\right)(b - a).$$
These formulas are special cases of the following one:
$$\iint_D g(x)h(y)\,dx\,dy = \left(\int_a^b g(x)\,dx\right)\left(\int_c^d h(y)\,dy\right).$$
This formula gives a double integral as the product of two ordinary (single)
integrals. It applies to such functions as
$$x^5 y^7, \qquad e^{x - y}, \qquad e^x \cos y, \qquad \text{etc.}$$
Fig. 2.2 cross-sectional slab
Let us try to justify the formula intuitively. The volume of a solid (the
left-hand side) is evaluated by slicing the solid into slabs, then adding up
(Fig. 2.2a). Fix $y$, then slice the solid by the planes parallel to the $z, x$-plane
at $y$ and at $y + dy$. The result is a slab of thickness $dy$. The projection (Fig.
2.2b) of this slab on the $z, x$-plane is the region in that plane under the curve
$z = z(x) = h(y)g(x)$. (Remember $y$ is fixed!) Hence its area is
$$\int_a^b z(x)\,dx = \int_a^b h(y)g(x)\,dx = h(y)\int_a^b g(x)\,dx.$$
The volume of the slab is its area times its thickness:
$$dV = \left(h(y)\int_a^b g(x)\,dx\right)dy = \left(\int_a^b g(x)\,dx\right)h(y)\,dy.$$
On the right-hand side, the first factor is a constant, therefore adding the
slabs yields
$$\int_c^d \left(\int_a^b g(x)\,dx\right)h(y)\,dy = \left(\int_a^b g(x)\,dx\right)\left(\int_c^d h(y)\,dy\right).$$
This is the desired formula.
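The product formula can also be seen at the level of midpoint sums, where the double sum factors into two single sums. The Python sketch below is an added illustration; the helper names, the choice $g(x) = x^4$, $h(y) = y^6$ on the unit square, and the partition size `n` are assumptions.

```python
def midpoint_1d(g, a, b, n):
    # Midpoint sum for a single integral of g over [a, b].
    h = (b - a) / n
    return h * sum(g(a + (k + 0.5) * h) for k in range(n))

def midpoint_2d(f, a, b, c, d, n):
    # Midpoint double sum for f over a <= x <= b, c <= y <= d.
    hx, hy = (b - a) / n, (d - c) / n
    return hx * hy * sum(f(a + (i + 0.5) * hx, c + (j + 0.5) * hy)
                         for i in range(n) for j in range(n))

g = lambda x: x ** 4
h = lambda y: y ** 6
n = 200  # assumed partition size

double = midpoint_2d(lambda x, y: g(x) * h(y), 0.0, 1.0, 0.0, 1.0, n)
product = midpoint_1d(g, 0.0, 1.0, n) * midpoint_1d(h, 0.0, 1.0, n)

print(double, product, 1 / 35)
```

The two computed numbers agree up to rounding because the double sum of $g(\bar{x}_i)h(\bar{y}_j)$ is algebraically the product of the two single sums; both approximate $\frac{1}{5} \cdot \frac{1}{7}$.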
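EXAMPLE 2.1
Find $\displaystyle\iint x^4 y^6\,dx\,dy$ over $0 \le x, y \le 1$.
Solution:
$$\iint_{0 \le x, y \le 1} x^4 y^6\,dx\,dy = \left(\int_0^1 x^4\,dx\right)\left(\int_0^1 y^6\,dy\right) = \frac{1}{5} \cdot \frac{1}{7}.$$
Answer: $\dfrac{1}{35}$.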
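EXAMPLE 2.2
Find $\displaystyle\iint e^x \cos y\,dx\,dy$ over $0 \le x \le 1$, $\pi/2 \le y \le \pi$.
Solution: Let $D$ denote the rectangle, $0 \le x \le 1$ and $\pi/2 \le y \le \pi$.
Then
$$\iint_D e^x \cos y\,dx\,dy = \left(\int_0^1 e^x\,dx\right)\left(\int_{\pi/2}^{\pi} \cos y\,dy\right) = \left(e^x \Big|_0^1\right)\left(\sin y \Big|_{\pi/2}^{\pi}\right) = (e - 1)(-1).$$
Answer: $1 - e$.
Question: The answer is negative. Why?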
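EXAMPLE 2.3
Find $\displaystyle\iint e^{x - y}\,dx\,dy$ over $0 \le x \le 1$, $-2 \le y \le -1$.
Solution: Since $e^{x - y} = e^x e^{-y}$,
$$\iint_D e^{x - y}\,dx\,dy = \left(\int_0^1 e^x\,dx\right)\left(\int_{-2}^{-1} e^{-y}\,dy\right) = \left(e^x \Big|_0^1\right)\left(-e^{-y} \Big|_{-2}^{-1}\right) = (e - 1)(e^2 - e).$$
Answer: $e(e - 1)^2$.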
EXAMPLE 2.4
Find $\displaystyle\iint (x^2 y - 3xy^2)\,dx\,dy$ over $1 \le x \le 2$, $-1 \le y \le 1$.
Solution: Use the linear property of the double integral:
$$\iint_D (x^2 y - 3xy^2)\,dx\,dy = \iint_D x^2 y\,dx\,dy - 3\iint_D xy^2\,dx\,dy.$$
Evaluate these two integrals separately:
$$\iint_D x^2 y\,dx\,dy = \left(\int_1^2 x^2\,dx\right)\left(\int_{-1}^{1} y\,dy\right) = 0;$$
$$\iint_D xy^2\,dx\,dy = \left(\int_1^2 x\,dx\right)\left(\int_{-1}^{1} y^2\,dy\right) = \frac{3}{2} \cdot \frac{2}{3} = 1.$$
Answer: $-3$.
EXERCISES
Evaluate:
1. II (3x 1) dx dy;
//
> / /
/ f
/ /
//
. J J (x17 - J/17)
//
/ /
/ /
I I (x ~ y ^ dxdy
14. J J ( x y ) 7bd x d y ;
15 i f (x ~^y ^ dxdy
2 J J ^ ^X
3. x2y2dxdy\
4. x2y2dx dy;
5. f f x3y3dxdy;
y ) dxdy;
7. II (x2y2) dxdy;
7) efcdt/;
ey2) dx dy;
10. ^ cos x cos ydxdy;
11. J J (x2+ y 2) dxdy;
12. [ f ( xy) 2dxdy;
13
- 1<* <2, 0 < y < 5
l < x < 1, 0 < y < In 2
- 1 < s< 0, 0 <i /< 1
0 < x , y < 1
0 < x, y < 1
-1 < 2/ < 1
1 < x, y < 1
-1 < V < 1
l < y < i
-1 < 2/ < 1
3 M '
' / M
/ /
16. 11 p dx dy; 1< x < 2, 1< y < 4
17. j / x 2 dx d y ; 0 < x < 2, 0 < ?/< 1
18. $\displaystyle\iint xy \ln x\,dx\,dy$; $1 \le x \le 4$, $1 \le y \le 2$
19. $\displaystyle\iint x \ln(xy)\,dx\,dy$; $2 \le x \le 3$, $1 \le y \le 2$
> / /
20. e*-H/ cos 2^ dy; 0 < x < 7r, 1 < y < 2.
3. ITERATED INTEGRALS
The formula in the last section for integrating products $g(x)h(y)$ is a
useful one, but it is inadequate for many functions, such as $f(x, y) = 1/(x + y)$
and $f(x, y) = y\cos(xy)$. In this section we derive the most general method for
evaluating double integrals, the method of iterated integration. This method
includes the previous rule for products as a special case.
The problem is to compute
$$V = \iint f(x, y)\,dx\,dy, \qquad a \le x \le b,\quad c \le y \le d.$$
Consider the integral as a volume to be found by slicing.
Fix a value of $y$ and slice the solid by the corresponding plane parallel
to the $x, z$-plane (Fig. 3.1).
The resulting cross-section has area $A(y)$. Denote the volume to the left
of this slice by $V(y)$. (Thus $V(c) = 0$, $V(d)$ = the desired volume.) The
fundamental fact needed here is that
$$\frac{dV}{dy} = A(y).$$
Intuitively this is fairly clear. The derivative $dV/dy$ is the limit as $h \to 0$
of
$$\frac{V(y + h) - V(y)}{h}.$$
For $h$ very small, the numerator is the volume of a slab of width $h$ and
cross-sectional area approximately $A(y)$. Hence the quotient is approximately
$A(y)$.
Now integrate $dV/dy = A(y)$ to find $V$:
$$V = \int_c^d A(y)\,dy.$$
But $A(y)$, the area of the cross-section (Fig. 3.2), can be expressed as a
simple integral. Indeed, $A(y)$ is the area under the curve $z = f(x, y)$ from
$x = a$ to $x = b$. (Remember $y$ is fixed in this process.) Thus
$$A(y) = \int_a^b f(x, y)\,dx,$$
where $y$ is treated as a constant in computing the integral.
The final result is the following pair of formulas.
Iteration Formulas
$$\iint_{\substack{a \le x \le b \\ c \le y \le d}} f(x, y)\,dx\,dy = \int_c^d \left(\int_a^b f(x, y)\,dx\right)dy = \int_a^b \left(\int_c^d f(x, y)\,dy\right)dx.$$
(The second form is obtained by reversing the roles of $x$ and $y$ in the argument.)
Our intuitive justification of the formulas will be replaced by an accurate
proof in Chapter 12 for continuous, and even somewhat more general, integrands.
Remark: The expressions on the right are iterated integrals, i.e.,
repetitions of simple integrals. They are often written this way:
$$\int_c^d dy \int_a^b f(x, y)\,dx, \qquad \int_a^b dx \int_c^d f(x, y)\,dy.$$
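EXAMPLE 3.1
Find $\displaystyle\iint \frac{dx\,dy}{x + y}$ over $0 \le x \le 1$, $1 \le y \le 2$.
Solution:
$$\iint \frac{dx\,dy}{x + y} = \int_0^1 \left(\int_1^2 \frac{dy}{x + y}\right)dx.$$
Now for fixed $x$,
$$\int_1^2 \frac{dy}{x + y} = \ln(x + y)\,\Big|_{y=1}^{y=2} = \ln(2 + x) - \ln(1 + x).$$
Hence
$$\iint \frac{dx\,dy}{x + y} = \int_0^1 \left[\ln(2 + x) - \ln(1 + x)\right]dx.$$
But
$$\int \ln u\,du = u \ln u - u + C,$$
therefore
$$\iint \frac{dx\,dy}{x + y} = \Big[(2 + x)\ln(2 + x) - (2 + x) - (1 + x)\ln(1 + x) + (1 + x)\Big]_0^1$$
$$= 3\ln 3 - 2\ln 2 - 2\ln 2 = 3\ln 3 - 4\ln 2.$$
Answer: $\ln\dfrac{27}{16}$.
An important feature of the Iteration Formulas above is that the iteration
may be done in either order. Sometimes the computation is difficult in one
order but relatively easy in the opposite order.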
EXAMPLE 3.2
Find $\displaystyle\iint y\cos(xy)\,dx\,dy$ over $0 \le x \le 1$, $0 \le y \le \pi$.
Solution: Here is one set-up:
$$\iint = \int_0^1 \left(\int_0^{\pi} y\cos(xy)\,dy\right)dx.$$
The inner integral,
$$\int_0^{\pi} y\cos(xy)\,dy,$$
while not impossible to integrate (by parts), is tricky. The alternate procedure
is iteration in the other order:
$$\iint = \int_0^{\pi} \left(\int_0^1 y\cos(xy)\,dx\right)dy.$$
Since $y$ is constant in the inner integration, this can be rewritten as
$$\iint = \int_0^{\pi} y\left(\int_0^1 \cos(xy)\,dx\right)dy.$$
Now
$$\int_0^1 \cos(xy)\,dx = \frac{1}{y}\sin(xy)\,\Big|_{x=0}^{x=1} = \frac{\sin y}{y}.$$
Hence
$$\iint = \int_0^{\pi} y\,\frac{\sin y}{y}\,dy = \int_0^{\pi} \sin y\,dy = 2.$$
Alternate writing of the solution:
$$\iint_D y\cos(xy)\,dx\,dy = \int_0^{\pi} y\,dy \int_0^1 \cos(xy)\,dx = \int_0^{\pi} y\left(\frac{1}{y}\sin(xy)\,\Big|_{x=0}^{x=1}\right)dy = \int_0^{\pi} \sin y\,dy = 2.$$
Answer: 2.
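That the iteration may be done in either order can be observed numerically. The Python sketch below is an added illustration, not the book's procedure: it nests one midpoint sum inside another in both orders for the integrand of Example 3.2. The helper names and the default partition size are assumptions.

```python
import math

def midpoint_1d(g, a, b, n=400):
    # Midpoint sum for a single integral of g over [a, b].
    h = (b - a) / n
    return h * sum(g(a + (k + 0.5) * h) for k in range(n))

def iterate_dx_then_dy(f, a, b, c, d):
    # Outer integral over y of ( inner integral over x of f(x, y) dx ).
    return midpoint_1d(lambda y: midpoint_1d(lambda x: f(x, y), a, b), c, d)

def iterate_dy_then_dx(f, a, b, c, d):
    # Outer integral over x of ( inner integral over y of f(x, y) dy ).
    return midpoint_1d(lambda x: midpoint_1d(lambda y: f(x, y), c, d), a, b)

f = lambda x, y: y * math.cos(x * y)
one_order = iterate_dx_then_dy(f, 0.0, 1.0, 0.0, math.pi)
other_order = iterate_dy_then_dx(f, 0.0, 1.0, 0.0, math.pi)

print(one_order, other_order)
```

Numerically the two orders cost the same; the difference in difficulty noted in the example concerns only the hand computation of antiderivatives.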
EXERCISES
Evaluate:
- f f ^ - r * >
2. J J ^ d x d y ; 0 < x < 1, 1 < y < 5
> ' / /
~2d x d y ; 1< a: < 1, 1< 2/ < 3
4 . / / . - * < * - . <, <<>
5. i/2sin (a??/) d x d y ; 0 < x < 27r, 0 < y < 1
6. (1 2x) sin(i/2) d x d y ; 0 < x , y < 1
7. y j sin (x/y) dx dy; 7r/2 < < 7r/2, 1< 1/ < 2
s. /y d - x + 2y)2 dxdy; 3 < a; < 4, 1< 1/ < 2
* 7 / -
> / /
9. J J (l + x + y)( 3 + x y) dxdy; 2 < x , y < 3
10. y^* sin (a; + 2/) dx dy; 0 < a;, y < 7r/2
11. / / < * + i/)wdx dy; 0 < a; < 1, 1 < 2/ < 2
12. J J (x ?/)ndx dy; 0 < a?, y.< 1.
13. Suppose/(a;, y) = /(as, 2/)- Prove J J fi x, y) dx dy = 0 ; 1 < *, y < 1.
14. Suppose / (a;, y ) = f(x, y) . Prove J J f(x, y) dx dy = 0; 1 < x, y < 1.
15. Find the constant $A$ that best approximates $f(x, y)$ on the square $0 \le x, y \le 1$ in
the least squares sense. In other words, minimize
$$\iint \left[f(x, y) - A\right]^2 dx\,dy; \qquad 0 \le x, y \le 1.$$
16. (cont.) Find the least squares linear approximation
$$A + Bx + Cy$$
to $f(x, y) = xy$ on the square $0 \le x, y \le 1$.
17. (cont.) Show that the coefficients of the least squares linear approximation
$A + Bx + Cy$ to $f(x, y)$ on the square $0 \le x, y \le 1$ satisfy
$$A + \tfrac{1}{2}B + \tfrac{1}{2}C = \iint f\,dx\,dy,$$
$$\tfrac{1}{2}A + \tfrac{1}{3}B + \tfrac{1}{4}C = \iint xf\,dx\,dy,$$
$$\tfrac{1}{2}A + \tfrac{1}{4}B + \tfrac{1}{3}C = \iint yf\,dx\,dy.$$
4. APPLICATIONS
Mass and Density
Suppose a sheet of non-homogeneous material covers the rectangle
$a \le x \le b$ and $c \le y \le d$. See Fig. 4.1. At each point $(x, y)$, let $\rho(x, y)$
denote the density of the material, i.e., the mass per unit area. (Dimensionally,
planar density is mass divided by length squared. Common units
are gm/cm$^2$ and lb/ft$^2$.)
The mass of a small rectangular portion of the sheet (Fig. 4.2) is
$$dM = \rho(x, y)\,dx\,dy.$$
Therefore the total mass of the sheet is
$$M = \iint_{\substack{a \le x \le b \\ c \le y \le d}} \rho(x, y)\,dx\,dy.$$
Fig. 4.1 non-homogeneous sheet
EXAMPLE 4.1
The density (lb/ft$^2$) at each point of a one-foot square of plastic
is the product of the four distances of the point from the sides of
the square. Find the total mass.
Solution: Take the square in the position $0 \le x, y \le 1$. Then
$$\rho(x, y) = x(1 - x)\,y(1 - y),$$
$$M = \iint \rho(x, y)\,dx\,dy = \left(\int_0^1 x(1 - x)\,dx\right)\left(\int_0^1 y(1 - y)\,dy\right) = \left(\frac{1}{6}\right)^2.$$
Answer: $\dfrac{1}{36}$ lb.
Moment and Center of Gravity
Suppose gravity (perpendicular to the plane of the figure) acts on the
rectangular sheet of Fig. 4.1. The sheet is to be suspended by a single point
so that it balances parallel to the floor. This point of balance is the center of
gravity of the sheet and is denoted $\bar{\mathbf{x}} = (\bar{x}, \bar{y})$.
The center of gravity $\bar{\mathbf{x}}$ is found in three steps:
(1) Find the mass $M$.
(2) Find the moment of the sheet with respect to the origin. This is
the vector
$$\mathbf{m} = \iint_D \rho\,\mathbf{x}\,dx\,dy = \left(\iint_D \rho x\,dx\,dy,\ \iint_D \rho y\,dx\,dy\right).$$
Here $D$ denotes the region the sheet occupies, $a \le x \le b$ and $c \le y \le d$,
and $\rho = \rho(x, y)$ is the density.
(3) Divide the moment by the mass to obtain the center of gravity:
$$\bar{\mathbf{x}} = \frac{1}{M}\,\mathbf{m}.$$
This formula will be derived after two examples.
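EXAMPLE 4.2
Find the center of gravity of a homogeneous rectangular sheet.
Solution: Homogeneous means the density $\rho$ is constant. Take the
sheet in the position $0 \le x \le a$ and $0 \le y \le b$. The mass is $M = \rho ab$, and
the moment is
$$\mathbf{m} = \iint \rho\,\mathbf{x}\,dx\,dy = \rho\left(\iint x\,dx\,dy,\ \iint y\,dx\,dy\right) = \rho\left(\frac{a^2 b}{2},\ \frac{a b^2}{2}\right) = \frac{\rho ab}{2}(a, b).$$
The center of gravity is
$$\bar{\mathbf{x}} = \frac{1}{M}\,\mathbf{m} = \frac{1}{2}(a, b).$$
This is the midpoint (intersection of the diagonals) of the rectangle. (Of
course the rectangle balances on its midpoint; no one needs calculus for this,
but it is reassuring that the analytic method gives the right answer.)
Answer: The midpoint of the rectangle.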
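EXAMPLE 4.3
A rectangular sheet over the region $1 \le x \le 2$ and $1 \le y \le 3$ has
density $\rho(x, y) = xy$. Find its center of gravity.
Solution: The mass is
$$M = \iint xy\,dx\,dy = \left(\int_1^2 x\,dx\right)\left(\int_1^3 y\,dy\right) = \frac{3}{2} \cdot 4 = 6.$$
The moment is
$$\mathbf{m} = \iint xy\,(x, y)\,dx\,dy = \left(\iint x^2 y\,dx\,dy,\ \iint xy^2\,dx\,dy\right)$$
$$= \left(\int_1^2 x^2\,dx \int_1^3 y\,dy,\ \int_1^2 x\,dx \int_1^3 y^2\,dy\right) = \left(\frac{7}{3} \cdot 4,\ \frac{3}{2} \cdot \frac{26}{3}\right) = \frac{1}{3}(28, 39).$$
Therefore
$$\bar{\mathbf{x}} = \frac{1}{M}\,\mathbf{m} = \frac{1}{18}(28, 39) = \left(\frac{14}{9}, \frac{13}{6}\right).$$
Note that $\bar{\mathbf{x}}$ is inside the rectangle, a little northeast of center. Could you
have predicted this?
Answer: $\bar{\mathbf{x}} = \left(\dfrac{14}{9}, \dfrac{13}{6}\right)$.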
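The three-step recipe (mass, moment, quotient) is easy to mechanize. The Python sketch below is an added illustration that uses midpoint double sums in place of exact integration; the function names and the partition size are assumptions, and the density $xy$ on $1 \le x \le 2$, $1 \le y \le 3$ is taken from Example 4.3.

```python
def midpoint_2d(f, a, b, c, d, n=300):
    # Midpoint double sum for f over a <= x <= b, c <= y <= d.
    hx, hy = (b - a) / n, (d - c) / n
    return hx * hy * sum(f(a + (i + 0.5) * hx, c + (j + 0.5) * hy)
                         for i in range(n) for j in range(n))

def center_of_gravity(rho, a, b, c, d):
    # Step 1: mass.  Step 2: the two components of the moment.
    mass = midpoint_2d(rho, a, b, c, d)
    moment_x = midpoint_2d(lambda x, y: rho(x, y) * x, a, b, c, d)
    moment_y = midpoint_2d(lambda x, y: rho(x, y) * y, a, b, c, d)
    # Step 3: divide the moment by the mass.
    return mass, (moment_x / mass, moment_y / mass)

mass, (x_bar, y_bar) = center_of_gravity(lambda x, y: x * y, 1.0, 2.0, 1.0, 3.0)
print(mass, x_bar, y_bar)
```

The same function can be pointed at any of the rectangular sheets in the exercises by changing the density and the bounds.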
The formula for center of gravity is derived by balancing the rectangular
sheet on various knife edges.
Suppose a knife edge passes through $\bar{\mathbf{x}}$ and the sheet balances (Fig. 4.3).
Divide the sheet into many small rectangles. The turning moments of these
pieces about the knife edge must add up to zero.
Let $\mathbf{n}$ be a unit vector in the plane of the rectangle perpendicular to the
knife edge. A small rectangle with sides $dx$ and $dy$ located at $\mathbf{x}$ has (signed)
distance $(\mathbf{x} - \bar{\mathbf{x}}) \cdot \mathbf{n}$ from the knife edge and has mass $\rho\,dx\,dy$. Hence its
turning moment is
$$\rho\,(\mathbf{x} - \bar{\mathbf{x}}) \cdot \mathbf{n}\,dx\,dy.$$
The sum of all such turning moments must be zero:
$$\iint \rho\,(\mathbf{x} - \bar{\mathbf{x}}) \cdot \mathbf{n}\,dx\,dy = 0.$$
Since $\bar{\mathbf{x}}$ and $\mathbf{n}$ are constant, this relation may be written
$$\mathbf{n} \cdot \iint \rho\,\mathbf{x}\,dx\,dy = (\mathbf{n} \cdot \bar{\mathbf{x}}) \iint \rho\,dx\,dy,$$
or
$$\mathbf{n} \cdot \mathbf{m} = M\,\mathbf{n} \cdot \bar{\mathbf{x}}.$$
This equation of balance is true for each choice of the knife edge (each
choice of the unit vector $\mathbf{n}$). Hence,
$$\mathbf{n} \cdot (\mathbf{m} - M\bar{\mathbf{x}}) = 0$$
for each unit vector $\mathbf{n}$. This means the component of $\mathbf{m} - M\bar{\mathbf{x}}$ in each direction
is zero. Therefore,
$$\mathbf{m} - M\bar{\mathbf{x}} = 0, \qquad \mathbf{m} = M\bar{\mathbf{x}}.$$
EXERCISES
Find the volume under the surface and over the portion of the x, f/-plane indicated.
Draw a figure in each case:
1. z = 2 (x2+ y2); I < x , y < 1
2. z = 1 xy\ 0 < x, y < 1
3. z = x2 f y2; . 0<z <2, 0 <2/ <l
4. z = sin x sin y; 0 < x, y < tt
5. z = x2y + y2x; l ^z ; <2, 2 < y < 3
6. 2 = (1 + x*)y2; 1 < x, y < 1.
A sheet of non-homogeneous material of density p gm/cm2 covers the indicated rec
tangle. Find the mass of the sheet, assuming lengths are measured in centimeters:
7. p = 3(1 + x) (1 + y); 0 < x, y < 1
8. p = 10.2xy; 0 < x < 1, 1 < y < 1.5
9. p = 3 + O.lz; 2 < a; < 3, 1< 2/ < 1
10. p = 4e*+* - 2; 0 < * < 1, 0 < y < 0.5.
Find the center of gravity of each rectangular sheet, density as given:
11. p = (1 x) ( l y) + 1; 0 < x, y < 1
12. p = sin x; 0 < x < 7r, 0 < y < 1
13. p = sinz(l sin y) ;
14. p = 10 -
15. p = 1 + z2+ i/2;
16. p = 2 + xV;
7r/2 < X, y < 7T
0 < x, y < 1
- 1 < X< 1, 1 < 2/ < 4
1 < x < 1, 0 < 2/ <l .
5. GENERAL DOMAINS
Suppose we want the volume of a solid (Fig. 5.1) bounded on top by a
surface $z = f(x, y)$ defined over a non-rectangular domain $D$. Here the bottom
of the solid is $D$ itself, and the side of the solid is the cylindrical wall with
generator parallel to the $z$-axis and base the boundary of $D$. The solid is the set
$$\{(x, y, z) \mid (x, y) \in D \text{ and } 0 \le z \le f(x, y)\}.$$
Fig. 5.1 solid under a surface
We shall suppose the volume is
$$V = \iint_D f(x, y)\,dx\,dy,$$
where this double integral over $D$ has properties consistent with our intuitive
ideas about volume. Let us consider some examples to see how we can compute
such volumes and what properties the double integral ought to have.
EXAMPLE 5.1
Find the volume under the surface $z = e^{-(x + y)}$, and over the domain
of the $x, y$-plane bounded by the $x$-axis, the line $y = x$, and the
lines $x = \frac{1}{2}$ and $x = 1$.
Solution: Draw figures of the domain (Fig. 5.2a) and the solid (Fig.
5.2b). The problem is solved by the slicing method. Slice the solid into slabs
by planes perpendicular to the $x$-axis. Let the area of the slab at $x$ be denoted
by $A(x)$. See Fig. 5.3a. The volume of the slab is
$$dV = A(x)\,dx,$$
so the total volume is
$$V = \int_{1/2}^{1} A(x)\,dx.$$
Now carefully draw the cross-section (Fig. 5.3b). Its base has length $x$, and
it is bounded above by the curve $z = e^{-x}e^{-y}$. Note that $x$ is constant; in this
plane section $z$ is a function of $y$ alone. This is the absolute crux of the matter!
The area of the cross-section at $x$ is
$$A(x) = \int_0^x e^{-x}e^{-y}\,dy = e^{-x}\int_0^x e^{-y}\,dy = e^{-x}\left(-e^{-y}\right)\Big|_{y=0}^{y=x} = e^{-x} - e^{-2x}.$$
Thus
$$V = \int_{1/2}^{1} \left(e^{-x} - e^{-2x}\right)dx = \left(\frac{1}{2}e^{-2x} - e^{-x}\right)\Big|_{1/2}^{1} = \frac{1}{2}e^{-2} - \frac{3}{2}e^{-1} + e^{-1/2}.$$
Note: The solution can be set up as follows:
$$V = \int_{1/2}^{1} \left(\int_0^x e^{-(x + y)}\,dy\right)dx.$$
In the inner integral the variable of integration is $y$, while $x$ is treated like a
constant, both in the integrand and in the upper limit. But once the inner
integral is completely evaluated, $x$ becomes the variable of integration for
the outer integral.
Answer: $\frac{1}{2}e^{-2}\left(1 - 3e + 2e^{3/2}\right)$.
EXAMPLE 5.2
Find the volume under the surface $z = 1 - x^2 - y^2$, lying over
the square with vertices $(\pm 1, 0)$ and $(0, \pm 1)$.
Solution: First draw the square (Fig. 5.4a). Observe that by symmetry,
it suffices to find the volume over the triangular portion in the first
quadrant, and then to quadruple it.
Fig. 5.4
Draw the corresponding portion of the solid (Fig. 5.4b). Slice by planes
perpendicular to the x-axis. For each x, the plane cuts the solid in a
cross-section (Fig. 5.5) whose area is
$$A(x) = \int_0^{1-x}\left(1 - x^2 - y^2\right)dy.$$
Fig. 5.5
The volume of the solid is
$$V = 4\int_0^1 A(x)\,dx = 4\int_0^1\left(\int_0^{1-x}\left(1 - x^2 - y^2\right)dy\right)dx$$
$$= 4\int_0^1\left[(1 - x^2)(1 - x) - \tfrac13(1 - x)^3\right]dx$$
$$= 4\int_0^1\left[1 - x - x^2 + x^3 - \tfrac13(1 - x)^3\right]dx$$
$$= 4\left[1 - \tfrac12 - \tfrac13 + \tfrac14 - \tfrac1{12}\right] = 4\cdot\tfrac{4}{12} = \tfrac43.$$
Answer: 4/3
EXAMPLE 5.3
Find the volume under the plane $z = 1 + x + y$, and over the
domain bounded by the curves $x = \tfrac12$, $x = 1$, $y = x^2$, $y = 2x^2$.
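Solution: The domain and the solid are drawn in Fig. 5.6. The cross-
section by a plane perpendicular to the x-axis at x is a trapezoid (Fig. 5.7).
Note the range of y, namely $x^2 < y < 2x^2$. Thus
$$V = \int_{1/2}^{1} A(x)\,dx,$$
where
$$A(x) = \int_{x^2}^{2x^2}(1 + x + y)\,dy = \left(y + xy + \tfrac12 y^2\right)\Big|_{y=x^2}^{y=2x^2}$$
$$= \left(2x^2 + 2x^3 + 2x^4\right) - \left(x^2 + x^3 + \tfrac12 x^4\right) = x^2 + x^3 + \tfrac32 x^4.$$
Finally,
$$V = \int_{1/2}^{1}\left(x^2 + x^3 + \tfrac32 x^4\right)dx = \left(\tfrac13 x^3 + \tfrac14 x^4 + \tfrac3{10}x^5\right)\Big|_{1/2}^{1} = \tfrac{53}{60} - \tfrac1{15} = \tfrac{49}{60}.$$
Answer: 49/60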
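Because the cross-section area in Example 5.3 is a polynomial, the last step can be verified with exact rational arithmetic. The following sketch is an illustration under stated assumptions: the function `integrate_poly` and the dictionary encoding of the polynomial are hypothetical helpers, not anything from the text.

```python
from fractions import Fraction

def integrate_poly(coeffs, lo, hi):
    """Exact integral over [lo, hi] of sum(c * x**k); coeffs maps k -> c."""
    def antiderivative(x):
        # Term by term: c * x^k integrates to c * x^(k+1) / (k+1).
        return sum(Fraction(c) * x ** (k + 1) / (k + 1) for k, c in coeffs.items())
    return antiderivative(Fraction(hi)) - antiderivative(Fraction(lo))

# A(x) = x^2 + x^3 + (3/2) x^4, the trapezoidal cross-section area.
area_coeffs = {2: 1, 3: 1, 4: Fraction(3, 2)}

# V = integral of A(x) from x = 1/2 to x = 1.
volume = integrate_poly(area_coeffs, Fraction(1, 2), 1)
print(volume)  # 49/60
```

Using `Fraction` avoids rounding, so the printed value can be compared directly with the Answer line.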
Iteration
With these examples behind us, we are prepared for the statement and
solution of the double integration problem. Suppose a domain D in the
x, y-plane is bounded by lines $x = a$ and $x = b$, and by two curves $y = g(x)$
and $y = f(x)$. Assume $a < b$ and $g(x) < f(x)$ for each x. See Fig. 5.8. Suppose
a surface $z = H(x, y)$ is given, defined over D. See Fig. 5.9. The volume of
the solid column over D and under the surface is then
$$V = \iint_D H(x, y)\,dx\,dy.$$
(As is usual with integrals, the portion where H < 0 is counted negative.)
To evaluate the double integral, consider a slab parallel to the y, z-plane
at x. Its face area is
$$A(x) = \int_{g(x)}^{f(x)} H(x, y)\,dy.$$
See Fig. 5.10. In this integration, x is constant. Notice that y, the variable of
integration, disappears when the definite integral A(x) is evaluated.
Conclusion: the volume integral is
$$\iint_D H(x, y)\,dx\,dy = \int_a^b A(x)\,dx = \int_a^b\left(\int_{g(x)}^{f(x)} H(x, y)\,dy\right)dx.$$
This is called the iteration formula. In Chapter 12, we shall prove that it
holds for any continuous function, not necessarily positive.
EXAMPLE 5.4
Find the volume under the surface $z = xy$, and over the domain D
bounded by $y = x$ and $y = x^2$.
Solution: The line and parabola intersect at (0, 0) and (1, 1). See
Fig. 5.11. For each value of x, the range of y is
$$x^2 < y < x.$$
Hence
$$V = \iint_D xy\,dx\,dy = \int_0^1\left(\int_{x^2}^{x} xy\,dy\right)dx = \int_0^1\left(\tfrac12 xy^2\right)\Big|_{y=x^2}^{y=x}dx$$
$$= \int_0^1 \tfrac12\left(x^3 - x^5\right)dx = \tfrac12\left(\tfrac14 - \tfrac16\right) = \tfrac1{24}.$$
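Alternate Solution: The domain D may be thought of as bounded by
the curves $x = y$ (below) and $x = \sqrt{y}$ (above), where $0 < y < 1$. See
Fig. 5.12. For each y, the range of x is $y < x < \sqrt{y}$. Therefore the set-up of
the iteration is
$$V = \int_0^1\left(\int_y^{\sqrt{y}} xy\,dx\right)dy = \tfrac12\int_0^1\left(y^2 - y^3\right)dy = \tfrac12\left(\tfrac13 - \tfrac14\right) = \tfrac1{24}.$$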
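The two solutions of Example 5.4 iterate in opposite orders over the same domain, and a short numerical experiment shows them agreeing. This is a sketch for illustration only: `simpson` is an assumed helper implementing the composite Simpson rule, and the variable names are hypothetical.

```python
import math

def simpson(func, lo, hi, n=200):
    """Composite Simpson rule on [lo, hi] with n (even) subintervals."""
    if n % 2:
        n += 1
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(lo + i * h)
    return total * h / 3

def inner_over_y(x):
    # x fixed; y runs from the parabola y = x^2 up to the line y = x.
    return simpson(lambda y: x * y, x * x, x)

def inner_over_x(y):
    # y fixed; x runs from the line x = y up to the curve x = sqrt(y).
    return simpson(lambda x: x * y, y, math.sqrt(y))

v_y_first = simpson(inner_over_y, 0.0, 1.0)  # integrate on y, then on x
v_x_first = simpson(inner_over_x, 0.0, 1.0)  # integrate on x, then on y

print(v_y_first, v_x_first, 1 / 24)
```

Both orders should land on approximately 1/24, which is the point of the alternate solution: the domain is the same, only its description by inequalities changes.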
The iteration method does not apply directly to every example. The
boundary of D may be too complicated, in which case you must break the
domain into several smaller regions, and deal with each as a separate problem.
In Fig. 5.13 two examples are shown. The set-up for the domain in Fig. 5.13a is
$$\iint_D H(x, y)\,dx\,dy = \iint_{D_1} H(x, y)\,dx\,dy + \iint_{D_2} H(x, y)\,dx\,dy$$
$$= \int_a^c\left(\int_{g(x)}^{f_1(x)} H(x, y)\,dy\right)dx + \int_c^b\left(\int_{g(x)}^{f_2(x)} H(x, y)\,dy\right)dx.$$
Fig. 5.13 cut up the domain
The set-up for the domain in Fig. 5.13b is
$$\iint_D H(x, y)\,dx\,dy = \iint_{D_1} H(x, y)\,dx\,dy + \iint_{D_2} H(x, y)\,dx\,dy + \iint_{D_3} H(x, y)\,dx\,dy$$
m
/l(*) \ fc / f h ( x ) \
H ( x , y ) d y ) d x + I / H (x, y) d y ) dx
2(x) / J e \ J qi(x) /
e \ J m( x)
+ H r H ( x , y ) d y ) d x .
J c 02(x) /
EXAMPLE 5.5
Without evaluating, set up the calculation for
$$\iint \frac{dx\,dy}{x^2 + y^2}$$
over the region indicated in Fig. 5.14.
Solution: Draw a vertical line from (1, 0) to (1, 2). Call the left-hand
part $D_1$ and the right-hand part $D_2$. See Fig. 5.15. Then
$$\iint_D \frac{dx\,dy}{x^2 + y^2} = \iint_{D_1} \frac{dx\,dy}{x^2 + y^2} + \iint_{D_2} \frac{dx\,dy}{x^2 + y^2}.$$
Area
It is often convenient to compute areas by double integrals, since
$$\iint_D 1\,dx\,dy = \text{area}(D).$$
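EXAMPLE 5.6
Find the total area A bounded by $y = x^2$ and $y = x^4$.
Solution: Draw the domain (Fig. 5.16). Then
$$A = \iint_D dx\,dy = 2\int_0^1\left(\int_{x^4}^{x^2} dy\right)dx = 2\int_0^1\left(x^2 - x^4\right)dx = \tfrac4{15}.$$
Alternate Solution:
$$A = 2\int_0^1\left(\int_{\sqrt{y}}^{\sqrt[4]{y}} dx\right)dy = 2\int_0^1\left(y^{1/4} - y^{1/2}\right)dy = \tfrac4{15}.$$
Answer: $A = \tfrac4{15}$.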
EXERCISES
Find the volume under the surface z = f (x, y), and over the indicated domain of the
x, y-plane:
1. $z = 1$; Fig. 5.17
2. $z = y$; Fig. 5.17
3. $z = x$; Fig. 5.17
4. $z = 1 + x + y$; Fig. 5.17
5. $z = x^2$; Fig. 5.18
6. $z = xy$; Fig. 5.18
7. $z = y^2$; Fig. 5.18
8. $z = (x - y)^2$; Fig. 5.18
Fig. 5.17 (Exs. 1-4)    Fig. 5.18 (Exs. 5-8)
9. $z = 1$; Fig. 5.19
10. $z = 1 + x$; Fig. 5.19
11. $z = y$; Fig. 5.19
12. $z = 1$; Fig. 5.20
13. $z = 1 + x$; Fig. 5.20
14. $z = x^2 y$; Fig. 5.20
15. $z = y^3$; Fig. 5.20
16. $z = x^2 y^2$; Fig. 5.20
17. $z = 1$; Fig. 5.21
18. $z = 1 + x$; Fig. 5.21.
Fig. 5.21 (Exs. 17-18)
19. Describe each of the domains in Figs. 5.17-5.21 by inequalities between x and y.
Compute the double integral over the domain whose boundary curves are indicated; in
each case draw a figure.
20. $\iint x e^{xy}\,dx\,dy$; $x = 1$, $x = 3$, $xy = 1$, $xy = 2$
21. $\iint x^2 y\,dx\,dy$; $y = 0$, $x = 0$, $x = (y - 1)^2$
22. $\iint (x^3 + y^3)\,dx\,dy$; $x^2 + y^2 = 1$
23. $\iint (x + y)^2\,dx\,dy$; $x + y = 0$, $y = x^2 + x$
24. $\iint (1 + xy)\,dx\,dy$; $y = 0$, $y = x$, $y = 1 - x$
25. $\iint y^2\,dx\,dy$; $y = \pm x$, $y = \tfrac12 x + 3$.
26. Find $\iint_D (1 + xy^2)\,dx\,dy$ over the domain bounded by the parabola $x = y^2$ and
the segments from (2, 0) to $(1, \pm 1)$.
6. POLAR COORDINATES
Suppose a domain (Fig. 6.1a) in the x, y-plane is described in polar
coordinates by $r_0 < r < r_1$ and $\theta_0 < \theta < \theta_1$. Suppose a surface (Fig. 6.1b) is
given by a function $z = f(r, \theta)$ over this domain. What is the volume of the
solid between the surface and the domain?
Fig. 6.1
Split the interval $r_0 < r < r_1$ into small pieces. Do the same for $\theta_0 < \theta < \theta_1$.
The result is a decomposition of the plane domain into many small, almost
rectangular, regions (Fig. 6.2). Each has dimensions $dr$ and $r\,d\theta$, hence area
$dA = r\,dr\,d\theta$. The portion of the solid lying over this elementary rectangle
has height $z = f(r, \theta)$, hence its volume is $dV = f(r, \theta)\,r\,dr\,d\theta$. See Fig. 6.3.
Fig. 6.2 elementary polar area    Fig. 6.3 rectangular column
The total volume is
$$V = \iint_D f(r, \theta)\,r\,dr\,d\theta = \int_{\theta_0}^{\theta_1}\left(\int_{r_0}^{r_1} f(r, \theta)\,r\,dr\right)d\theta = \int_{r_0}^{r_1} r\left(\int_{\theta_0}^{\theta_1} f(r, \theta)\,d\theta\right)dr.$$
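The only difference from the rectangular case is the extra factor r in the element of area. A minimal numerical sketch, assuming a hypothetical helper `polar_volume` built on a composite Simpson rule (neither name comes from the text), makes that factor explicit. As test cases it uses the height f = 1, which should give the area of the polar rectangle, and the cone z = r over a quarter disk of radius a, whose volume works out to πa³/6.

```python
import math

def simpson(func, lo, hi, n=200):
    """Composite Simpson rule on [lo, hi] with n (even) subintervals."""
    if n % 2:
        n += 1
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(lo + i * h)
    return total * h / 3

def polar_volume(f, r0, r1, theta0, theta1):
    """Volume under z = f(r, theta) over r0 < r < r1, theta0 < theta < theta1."""
    def inner(theta):
        # The factor r comes from the area element dA = r dr dtheta.
        return simpson(lambda r: f(r, theta) * r, r0, r1)
    return simpson(inner, theta0, theta1)

a = 2.0
quarter_disk_area = polar_volume(lambda r, theta: 1.0, 0.0, a, 0.0, math.pi / 2)
cone_volume = polar_volume(lambda r, theta: r, 0.0, a, 0.0, math.pi / 2)

print(quarter_disk_area, math.pi * a ** 2 / 4)
print(cone_volume, math.pi * a ** 3 / 6)
```

Dropping the factor `* r` in `inner` would give the wrong area for the quarter disk, which is a quick way to see why the area element must be r dr dθ rather than dr dθ.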
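EXAMPLE 6.1
Find the volume under the cone $z = r$, and over the domain
$0 < r < a$ and $0 < \theta < \pi/2$.
Solution: The solid in question is shown in Fig. 6.4. Its volume is
$$V = \int_0^{\pi/2}\left(\int_0^a r\cdot r\,dr\right)d\theta = \frac{a^3}{3}\cdot\frac{\pi}{2}.$$
Answer: $V = \pi a^3/6$.
Mass and Center of Gravity in Polar Coordinates
The mass and the center of gravity of a non-homogeneous sheet over the
domain in Fig. 6.2 can be computed by double integrals. Since the element of
area is $dA = r\,dr\,d\theta$, the elementary mass is $dM = \rho(r, \theta)\,r\,dr\,d\theta$. Note that $\rho$
must be expressed in polar coordinates. Thus
$$M = \iint_D \rho(r, \theta)\,r\,dr\,d\theta.$$
To find the moment, express $\mathbf{x}$ in polar coordinates:
$\mathbf{x} = (x, y) = (r\cos\theta, r\sin\theta)$. Then $\bar{\mathbf{x}} = \mathbf{m}/M$, where
$$\mathbf{m} = \iint_D \mathbf{x}\,dM = \iint_D \mathbf{x}\,\rho\,r\,dr\,d\theta = \iint_{\substack{r_0 < r < r_1 \\ \theta_0 < \theta < \theta_1}} (r\cos\theta, r\sin\theta)\,\rho\,r\,dr\,d\theta.$$
EXAMPLE 6.2
Find the center of gravity of a homogeneous sheet in the shape of
the semicircle $0 < r < a$ and $0 < \theta < \pi$.
Solution: Here $\rho$ is constant. The mass is
$$M = \tfrac12\pi a^2\rho.$$
The moment is
$$\mathbf{m} = \iint \mathbf{x}\,\rho\,r\,dr\,d\theta = \iint (r\cos\theta, r\sin\theta)\,\rho\,r\,dr\,d\theta = \rho\left(\iint r^2\cos\theta\,dr\,d\theta,\ \iint r^2\sin\theta\,dr\,d\theta\right)$$
$$= \rho\left(\int_0^a r^2\,dr\int_0^\pi\cos\theta\,d\theta,\ \int_0^a r^2\,dr\int_0^\pi\sin\theta\,d\theta\right) = \rho\left(0, \tfrac23 a^3\right).$$
Therefore
$$\bar{\mathbf{x}} = \frac1M\,\mathbf{m} = \frac{2}{\pi a^2}\left(0, \tfrac23 a^3\right).$$
Answer: $\bar{\mathbf{x}} = \left(0, \dfrac{4a}{3\pi}\right)$.
EXAMPLE 6.3
Find the center of gravity of a sheet with density $\rho = r$ covering
the quarter circle $0 < r < a$ and $0 < \theta < \pi/2$.
Solution: The mass is
$$M = \iint \rho\,r\,dr\,d\theta = \iint r^2\,dr\,d\theta = \int_0^a r^2\,dr\int_0^{\pi/2}d\theta = \frac{\pi a^3}{6}.$$
The moment is
$$\mathbf{m} = \iint \mathbf{x}\,\rho\,r\,dr\,d\theta = \iint (r\cos\theta, r\sin\theta)\,r^2\,dr\,d\theta = \int_0^a r^3\,dr\int_0^{\pi/2}(\cos\theta, \sin\theta)\,d\theta = \frac{a^4}{4}(1, 1).$$
Therefore
$$\bar{\mathbf{x}} = \frac1M\,\mathbf{m} = \frac{6}{\pi a^3}\cdot\frac{a^4}{4}(1, 1).$$
Answer: $\bar{\mathbf{x}} = \left(\dfrac{3a}{2\pi}, \dfrac{3a}{2\pi}\right)$.
Could you have predicted that $\bar{\mathbf{x}}$ lies on the line $y = x$ and $|\bar{\mathbf{x}}| > a/2$?
Fig. 6.5 domain between two polar curves
Other Domains
Now consider (Fig. 6.5a) a domain D defined by inequalities
$$g(\theta) < r < f(\theta), \qquad \theta_0 < \theta < \theta_1,$$
where $f(\theta)$ and $g(\theta)$ are continuous functions and $g(\theta) < f(\theta)$. The iteration
formula for such a domain is
$$\iint_D H(x, y)\,dx\,dy = \iint_D H(r\cos\theta, r\sin\theta)\,r\,dr\,d\theta = \int_{\theta_0}^{\theta_1}\left(\int_{r=g(\theta)}^{r=f(\theta)} H(r\cos\theta, r\sin\theta)\,r\,dr\right)d\theta.$$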
EXAMPLE 6.4
Set up in polar coordinates the integral $\displaystyle\iint_D \frac{dx\,dy}{x^2 + y^2}$ over the
domain D of Fig. 5.14. Do not evaluate.
Solution: The domain D splits naturally into two regions, one the
reflection of the other in the line $y = x$; the function is symmetric in this
line (Fig. 6.6). In polar coordinates, the line $x = 2$ is $r\cos\theta = 2$ or $r = 2\sec\theta$.
Consequently, the lower region $D_1$ is determined by $0 < \theta < \pi/4$ and
$1 < r < 2\sec\theta$. The set-up is
$$\iint_D \frac{dx\,dy}{x^2 + y^2} = 2\iint_{D_1} \frac{r\,dr\,d\theta}{r^2} = 2\int_0^{\pi/4}\left(\int_1^{2\sec\theta}\frac{dr}{r}\right)d\theta = 2\int_0^{\pi/4}\ln(2\sec\theta)\,d\theta.$$
EXAMPLE 6.5
Compute $\iint xy\,dx\,dy$ over the domain D defined by $r = \sin 2\theta$ (Fig. 6.7).
Fig. 6.7
Solution: Draw a figure (Fig. 6.7). In polar coordinates,
$$xy = (r\cos\theta)(r\sin\theta) = \tfrac12 r^2\sin 2\theta,$$
hence
$$\iint_D xy\,dx\,dy = \iint_D \tfrac12 r^2\sin 2\theta\cdot r\,dr\,d\theta = \tfrac12\int_0^{\pi/2}(\sin 2\theta)\left(\int_0^{\sin 2\theta} r^3\,dr\right)d\theta$$
$$= \tfrac12\int_0^{\pi/2}(\sin 2\theta)\left(\tfrac14\sin^4 2\theta\right)d\theta = \tfrac18\int_0^{\pi/2}\sin^5 2\theta\,d\theta = \tfrac1{16}\int_0^{\pi}\sin^5\alpha\,d\alpha$$
$$= \tfrac18\int_0^{\pi/2}\sin^5\alpha\,d\alpha = \tfrac18\cdot\frac{2\cdot 4}{3\cdot 5} = \tfrac1{15} \quad\text{(by tables)}.$$
Answer: 1/15
EXAMPLE 6.6
Find the volume common to the three solid right circular cylinders
bounded by $x^2 + y^2 = 1$, $y^2 + z^2 = 1$, $z^2 + x^2 = 1$.
Fig. 6.8 three intersecting cylinders
Solution: The first problem is drawing the solid. By symmetry, only
the portion in the first octant need be shown. In Fig. 6.8a the common part
of $y^2 + z^2 = 1$ and $z^2 + x^2 = 1$ is shown. The base of the third cylinder
$x^2 + y^2 = 1$ is dotted. In Fig. 6.8b this third cylinder is drawn, cutting off
the desired solid. One sees that the solid is symmetric in the plane $x = y$,
hence only the half to the left of this plane is needed (Fig. 6.9a). This is the
solid under the surface $z = \sqrt{1 - x^2}$ and over the wedge-shaped domain D
shown in Fig. 6.9b.
Fig. 6.9
Compute the volume using polar coordinates:
$$V = 16\iint_D \sqrt{1 - x^2}\,dx\,dy = 16\iint_D \sqrt{1 - r^2\cos^2\theta}\,r\,dr\,d\theta = 16\int_0^{\pi/4}\left(\int_0^1 r\sqrt{1 - r^2\cos^2\theta}\,dr\right)d\theta$$
$$= \frac{16}{3}\int_0^{\pi/4}\left[-\frac{(1 - r^2\cos^2\theta)^{3/2}}{\cos^2\theta}\right]_{r=0}^{r=1}d\theta = \frac{16}{3}\int_0^{\pi/4}\frac{1 - \sin^3\theta}{\cos^2\theta}\,d\theta.$$
Now
$$\frac{1 - \sin^3\theta}{\cos^2\theta} = \frac{1 - \sin^3\theta}{1 - \sin^2\theta} = \frac{1 + \sin\theta + \sin^2\theta}{1 + \sin\theta} = \sin\theta + \frac{1}{1 + \sin\theta}.$$
From tables,
$$\int\frac{d\theta}{1 + \sin\theta} = -\tan\left(\frac\pi4 - \frac\theta2\right) + C,$$
hence
$$V = \frac{16}{3}\left[-\cos\theta - \tan\left(\frac\pi4 - \frac\theta2\right)\right]_0^{\pi/4} = \frac{16}{3}\left[1 - \frac{\sqrt2}{2} - \tan\frac\pi8 + 1\right] = \frac{16}{3}\left[2 - \frac{\sqrt2}{2} - \frac{\sqrt2}{2 + \sqrt2}\right].$$
Answer: $8(2 - \sqrt2)$.
Note: The expression for $\tan(\pi/8)$ comes from the half-angle formula
$\tan(\theta/2) = \sin\theta/(1 + \cos\theta)$.
EXERCISES
Find the volume over the region given in polar coordinates and under the indicated
surface. Sketch:
1. $z = x^2 + y^2$; $1 < r < 2$, $0 < \theta < \pi/2$
2. $z = x$; $0 < r < 1$, $-\pi/2 < \theta < \pi/2$
3. $z = xy$; $1 < r < 2$, $\pi/4 < \theta < \pi/2$.
4. Use polar coordinates to find the volume of a hemisphere of radius a.
5. Find the volume of the region bounded by the two paraboloids of revolution
$z = x^2 + y^2$ and $z = 4 - 3(x^2 + y^2)$.
6. Find the volume of the lens-shaped region common to the sphere of radius 1
centered at (0, 0, 0) and the sphere of radius 1 centered at (0, 0, 1).
7. A drill of radius b bores on center through a sphere of radius a, where $a > b$.
How much material is removed?
8. Find the center of gravity of a sheet with uniform density in the shape of a quarter
circle $0 < r < a$ and $0 < \theta < \pi/2$.
9. (cont.) Now do it the easy way, using the result of Example 6.2.
10. Find the center of gravity of a homogeneous wedge of pie in the position
$0 < r < a$ and $0 < \theta < \alpha$.
11. (cont.) Hold the radius a of Ex. 10 fixed, but let $\alpha \to 0$. What is the limiting
position of the center of gravity?
12. A quarter-circular sheet in the position $0 < r < a$ and $0 < \theta < \pi/2$ has density
$\rho = a^2 r^2$. Find its center of gravity.
13. Find the volume under the surface $z = x^4 y^4$, and over the circle $x^2 + y^2 < 1$.
14. Find the volume under the surface $z = r^3$, and over the quarter circle $0 < r < 1$,
$0 < \theta < \pi/2$.
15. Find the volume under the cone $z = 5r$, and over the rose petal with boundary
$r = \sin 3\theta$, $0 < \theta < \pi/3$.
16. Find the area bounded by the spiral $r = \theta$, $0 < \theta < 2\pi$, and the x-axis from 0 to $2\pi$.
17. Evaluate over the unit disk $0 < r < 1$:
18. Evaluate over the unit disk $0 < r < 1$:
19. Evaluate over the unit disk $0 < r < 1$:
20. Evaluate over the unit disk $0 < r < 1$: / / n S r ? - / / * *
21. Find the volume common to two right circular cylinders of radius 1 intersecting
at right angles on center.
22. (cont.) Find the volume of the region in space consisting of all points (x, y, z),
where one or more of the following three conditions hold:
(1) $x^2 + y^2 < 1$, $-2 < z < 2$
(2) $y^2 + z^2 < 1$, $-2 < x < 2$
(3) $z^2 + x^2 < 1$, $-2 < y < 2$.
[Use the results of Ex. 21 and Example 6.6; after that, it's pure logic, not calculus.]
The game of jacks has objects more or less in the given shape.
23. Evaluate the integral in Example 6.6 using rectangular coordinates.
11. Multiple Integrals
1. TRIPLE INTEGRALS
In this chapter we shall learn how to set up and apply triple integrals. As
with double integrals, we shall work intuitively, and postpone the theory
behind multiple integration until Chapter 12.
Let us consider the following problem. Suppose we have a bounded domain
D in space filled by a non-homogeneous solid. At each point $\mathbf{x}$ the density of
the solid is $\delta(\mathbf{x})$ gm/cm³. What is the total mass? Our previous experience
with problems of this type on the line and in the plane leads us to expect an
answer of the form
$$M = \iiint_D \delta(\mathbf{x})\,dx\,dy\,dz,$$
where the triple integral is defined by decomposing D into many small boxes,
forming a suitable triple sum
$$\sum\sum\sum \delta(\bar{\mathbf{x}}_{ijk})\,(x_i - x_{i-1})(y_j - y_{j-1})(z_k - z_{k-1}),$$
and taking limits. The theory says that this process does indeed define the
triple integral, and that in practice the triple integral can be evaluated by
iteration.
Here is how iteration works when the domain is the part of a cylinder
bounded between two surfaces, each the graph of a function. Precisely,
suppose two surfaces $z = g(x, y)$ and $z = f(x, y)$ are defined over a domain S
in the x, y-plane, and that $g(x, y) < f(x, y)$. See Fig. 1.1.
These surfaces may be considered as the top and bottom of a domain D in the
cylinder over S. Thus D consists of all points (x, y, z) where (x, y) is in S and
$g(x, y) < z < f(x, y)$. In this situation the iteration formula is
$$\iiint_D \delta(\mathbf{x})\,dx\,dy\,dz = \iint_S\left(\int_{z=g(x,y)}^{z=f(x,y)} \delta(x, y, z)\,dz\right)dx\,dy.$$
Remark: Some prefer the notation
$$\iint_S dx\,dy\int_{g(x,y)}^{f(x,y)} \delta(x, y, z)\,dz.$$
The reason for the iteration formula is illustrated in Fig. 1.2. First, the
little pieces of mass $\delta(x, y, z)\,dx\,dy\,dz$ in one column are added up by an
integral in the vertical direction (x and y fixed, z variable). The result is
$$\left(\int_{g(x,y)}^{f(x,y)} \delta(x, y, z)\,dz\right)dx\,dy.$$
Then the masses of these individual columns are totaled by a double integral
over S.
EXAMPLE 1.1
Find $\iiint (x^2 + y)\,dx\,dy\,dz$, taken over the block
$1 < x < 2$, $0 < y < 1$, $2 < z < 5$.
Solution: The upper and lower boundaries are the planes $z = 5$ and
$z = 2$. Therefore
$$\iiint (x^2 + y)\,dx\,dy\,dz = \iint_S\left(\int_2^5 (x^2 + y)\,dz\right)dx\,dy,$$
where S is the rectangle $1 < x < 2$ and $0 < y < 1$. Now x and y are constant
in the inner integral:
$$\int_2^5 (x^2 + y)\,dz = 3(x^2 + y),$$
hence
$$\iiint (x^2 + y)\,dx\,dy\,dz = \iint_S 3(x^2 + y)\,dx\,dy = 3\int_1^2\left(x^2 + \tfrac12\right)dx = 3\left(\tfrac73 + \tfrac12\right) = \tfrac{17}{2}.$$
Answer: 17/2
EXAMPLE 1.2
Compute $\iiint x^3 y^2 z\,dx\,dy\,dz$ over the domain D bounded by
$x = 1$, $x = 2$; $y = 0$, $y = x^2$; and $z = 0$, $z = 1/x$.
Solution: The domain D is the portion between the surfaces $z = 0$
and $z = 1/x$ of a solid cylinder parallel to the z-axis. The cylinder has base S
in the x, y-plane, where S is shown in Fig. 1.3. The solid D itself is sketched
in Fig. 1.4. (A rough sketch showing the general shape is quite satisfactory.)
Fig. 1.3    Fig. 1.4
The iteration is
$$\iiint_D x^3 y^2 z\,dx\,dy\,dz = \iint_S\left(\int_0^{1/x} x^3 y^2 z\,dz\right)dx\,dy = \iint_S x^3 y^2\left(\tfrac12 z^2\Big|_0^{1/x}\right)dx\,dy = \tfrac12\iint_S x y^2\,dx\,dy$$
$$= \tfrac12\int_1^2 x\left(\int_0^{x^2} y^2\,dy\right)dx = \tfrac16\int_1^2 x^7\,dx = \tfrac1{48}\left(2^8 - 1\right) = \tfrac{255}{48} = \tfrac{85}{16}.$$
Fig. 1.5
Alternate Solution: The domain may be considered as the portion
between the surfaces $y = 0$ and $y = x^2$ of a solid cylinder parallel to the y-axis.
The cylinder has base T in the z, x-plane (Fig. 1.5). From this viewpoint,
the first integration is with respect to y; the iteration is
$$\iiint_D x^3 y^2 z\,dx\,dy\,dz = \iint_T x^3 z\left(\int_0^{x^2} y^2\,dy\right)dx\,dz = \iint_T \tfrac13 x^9 z\,dx\,dz$$
$$= \tfrac13\int_1^2 x^9\left(\int_0^{1/x} z\,dz\right)dx = \tfrac16\int_1^2 x^7\,dx = \tfrac{85}{16}.$$
Answer: 85/16
Remark: It is bad technique to consider the region as a solid cylinder
parallel to the x-axis, because the projection of the solid into the y, z-plane
breaks into four parts. Therefore, the solid D itself must be decomposed into
four parts, and the triple integral correspondingly expressed as a sum of four
triple integrals (Fig. 1.6). The resulting computation is much longer than that
in either of the previous solutions.
Fig. 1.6
In practice, try to pick an order of iteration which decomposes the required
triple integral into as few summands as possible, hopefully only one. The
typical summand has the form
$$\int_a^b\left[\int_{k(x)}^{h(x)}\left(\int_{g(x,y)}^{f(x,y)} \rho(x, y, z)\,dz\right)dy\right]dx.$$
(Possibly the variables are in some other order.) Once the integral
$$\int_{g(x,y)}^{f(x,y)} \rho(x, y, z)\,dz$$
is evaluated, the result is a function of x and y alone; z does not appear.
Likewise, once the integral
$$\int_{k(x)}^{h(x)}\left(\int_{g(x,y)}^{f(x,y)} \rho(x, y, z)\,dz\right)dy$$
is evaluated, the result is a function of x alone; y does not appear.
Remember there are six possible orders of iteration for triple integrals. If
you encounter an integrand you cannot find in tables, try a different order of
iteration.
Domains and Inequalities
A domain in the plane or in space is frequently specified by a system of
inequalities. A single inequality $f(x, y, z) > 0$ determines a domain whose
boundary is $f(x, y, z) = 0$.
To find the domain described by several such inequalities, draw the domain
each describes, then form their intersection.
EXAMPLE 1.3
Draw the plane domain given by $x + y \le 0$, $y \ge x^2 + 2x$.
Solution: The first inequality determines the domain below (and on)
the line $x + y = 0$. See Fig. 1.7a. The second inequality determines the
domain above (and on) the parabola $y = x^2 + 2x$. See Fig. 1.7b. The line and
parabola intersect at (0, 0) and (-3, 3). The domain satisfying both
inequalities is shown in Fig. 1.7c.
Fig. 1.7
EXAMPLE 1.4
Describe the plane domain S defined by $0 < x < y < a$. Write
the integral $\iint_S f(x, y)\,dx\,dy$ as an iterated integral in both orders.
Solution: The region is described by the three inequalities
$$x > 0, \qquad y > x, \qquad y < a.$$
Draw the corresponding domains and take their intersection, a triangle
(Fig. 1.8).
Fig. 1.8
To integrate first on x, arrange the inequalities describing the triangle in
this order:
$$0 < y < a, \qquad 0 < x < y.$$
Think of y as fixed in the interval [0, a]. Then x runs from 0 to y, leading to
$$\int_0^y f(x, y)\,dx.$$
Now sum these quantities for y running from 0 to a. The result is
$$\iint_S f(x, y)\,dx\,dy = \int_0^a\left(\int_0^y f(x, y)\,dx\right)dy.$$
To integrate first on y, describe the region in the other order:
$$0 < x < a, \qquad x < y < a.$$
Now the result is
$$\iint_S f(x, y)\,dx\,dy = \int_0^a\left(\int_x^a f(x, y)\,dy\right)dx.$$
Answer: The triangle with vertices (0, 0), (0, a), (a, a);
$$\int_0^a\left(\int_0^y f(x, y)\,dx\right)dy = \int_0^a\left(\int_x^a f(x, y)\,dy\right)dx.$$
Remark: A special case is interesting. Suppose $f(x, y) = g(x)$, a function
of x alone. Then the right-hand side is
$$\int_0^a\left(\int_x^a g(x)\,dy\right)dx = \int_0^a g(x)\left(\int_x^a dy\right)dx = \int_0^a (a - x)\,g(x)\,dx.$$
The following formula is a consequence:
$$\int_0^a\left(\int_0^y g(x)\,dx\right)dy = \int_0^a (a - x)\,g(x)\,dx.$$
Thus the iterated integral of a function of one variable can be expressed as a
simple integral.
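The formula in the Remark of Example 1.4 lends itself to a quick numerical check. In the sketch below, the choices g(x) = cos x and a = 2 are arbitrary assumptions made for illustration, and `simpson` is a hypothetical helper; for these choices both sides work out by hand to 1 - cos 2.

```python
import math

def simpson(func, lo, hi, n=200):
    """Composite Simpson rule on [lo, hi] with n (even) subintervals."""
    if n % 2:
        n += 1
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(lo + i * h)
    return total * h / 3

a = 2.0
g = math.cos  # sample integrand, chosen only for this illustration

# Left side: integrate g(x) on 0 < x < y first, then on 0 < y < a.
iterated = simpson(lambda y: simpson(g, 0.0, y), 0.0, a)

# Right side: a single integral of (a - x) g(x) over 0 < x < a.
simple = simpson(lambda x: (a - x) * g(x), 0.0, a)

print(iterated, simple, 1 - math.cos(a))
```

The factor (a - x) is the length of the vertical segment x < y < a in the triangle, which is why reversing the order of iteration collapses the inner integral.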
Tetrahedra
If a domain D is specified by inequalities, it may be possible to arrange the
inequalities so that limits of integration can be set up automatically. For
example, suppose the inequalities can be arranged in this form:
$$a < x < b, \qquad h(x) < y < k(x), \qquad g(x, y) < z < f(x, y).$$
Then
$$\iiint_D \delta(x, y, z)\,dx\,dy\,dz = \int_a^b\left[\int_{h(x)}^{k(x)}\left(\int_{g(x,y)}^{f(x,y)} \delta(x, y, z)\,dz\right)dy\right]dx.$$
Tetrahedral domains can be expressed by such inequalities, and they occur
frequently enough that it is useful to practice setting up integrals over them.
EXAMPLE 1.5
A tetrahedron T has vertices at (0, 0, 0), (a, 0, 0), (0, b, 0),
(0, 0, c), where a, b, c > 0. Set up $\iiint_T \delta(x, y, z)\,dx\,dy\,dz$ as an
iterated integral.
Solution: The slanted surface (Fig. 1.9) has equation
$$\frac xa + \frac yb + \frac zc = 1.$$
The domain is defined by the inequalities
$$0 < x, \qquad 0 < y, \qquad 0 < z, \qquad \frac xa + \frac yb + \frac zc < 1.$$
Any order of iteration is satisfactory; for instance, choose the order of
integration
$$\int\left[\int\left(\int \delta(x, y, z)\,dz\right)dy\right]dx.$$
To find the limits of integration, arrange the inequalities in this form:
$$a < x < b, \qquad h(x) < y < k(x), \qquad g(x, y) < z < f(x, y).$$
If x and y are fixed, then
$$0 < z < c\left(1 - \frac xa - \frac yb\right).$$
If x is fixed, then
$$0 < y < b\left(1 - \frac xa - \frac zc\right) < b\left(1 - \frac xa\right).$$
Finally,
$$0 < x < a\left(1 - \frac yb - \frac zc\right) < a.$$
The resulting system of inequalities (equivalent to the original system) is
$$0 < x < a, \qquad 0 < y < b\left(1 - \frac xa\right), \qquad 0 < z < c\left(1 - \frac xa - \frac yb\right).$$
The corresponding iteration is
$$\iiint_T \delta(x, y, z)\,dx\,dy\,dz = \int_0^a\left[\int_0^{b(1 - x/a)}\left(\int_0^{c(1 - x/a - y/b)} \delta(x, y, z)\,dz\right)dy\right]dx.$$
EXERCISES
Evaluate the triple integral over the indicated domain:
1. JJJ ^ ~ z) dx dy dz; solid cylinder under z =2 x2, based on the triangle with
vertices (0, 0, 0), (1, 0, 0), (0, 1, 0)
2. f j j y t e l , * ; solid cylinder under $z = x + y$, based on the triangle in Ex. 1
3. $\iiint dx\,dy\,dz$; pyramid with apex (0, 0, 1), based on the square with vertices
$(\pm 1, \pm 1)$
4. The same, except the square base has vertices $(\pm 1, 0)$, $(0, \pm 1)$
5.
[ [ f a z dx dy dz; the domain lies in the first octant and is bounded by x = 0, z=0, y = x2, and z = 1y2 6. J J J (3x2z2) dx dy dz; the domain in the slab 0 < y < 1bounded by z = y2, z =y2, and y = x, y = x. 7. A solid cube has side a. Its density at each point is k times the product of the 6 distances of the point to the faces of the cube, where k is constant. Find the mass. 8. Charge is distributed over the tetrahedron with vertices 0, i, j, k. The charge density at each point is a constant k times the product of the 4 distances from the point to the faces of the tetrahedron. Find the total charge. 9. Find ( i ( (x + y + z)2dx dy dz over the domain in the first octant bounded by the coordinate planes, the plane x + y + z = 2, and the 3 planes x = 1, y 1, 2=1. 2. Cyl i ndri cal Coordi nates 419 ( l l dx dy dz 10. Find / / / t|-----; rz over the domain in the first octant between the planes J J J (x+y + z)2 x + y + z = 1and x + y + z = 4. [Hint: Think!] Sketch the domain: 11. x2 + y2< 1, y + x2 > 0 12. x2 + y2 < 1, x2 < y < x2 13. x2+ y2 > 1, (X - 2)2+ y2 < 9 14. x > 3, y < 5, y x > 10 15. x + y < 0, xy < 1, (x y) 2 < 1 16. (x + y)2<1, (x y)2< 1. Express the double integral J J f ( x, y) dx dy over the specified domain as the sum of one or more iterated integrals in which y is the first variable integrated: 17. x2+ y2< 1, x2+ (y - 1)2< 1 18. y > (x + 1)2, y + 2x < 3 19. x > 0, 0 < y < t , x < sin y 20. x > 0, x2y2> 1, x2+ y2< 9. 21. Describe the domain D:0< x< y< z < 1. Iterate III p (x, y, z) dx dy dz in the I I I ' orders x, y, z; z, y, x; and x, z, y. D 22. Repeat Ex. 21 for a < x < y < z < 6. 23. Repeat Ex. 21 for 0 < x < 2y < 3z < 6. 24. Express a triple integral over the domain determined by (x 1)2+ y2< 4, 0 < z < y, (x + 1)2+ y2< 4 as one iterated integral (not a sum of two or more). Sketch the domain. 25. Find the integral of x over the tetrahedron with vertices (5,6,3), (4,6,3), (5,5,3), (5,6,2). 26. 
Set up the integral of $\delta(x, y, z)$ over the tetrahedron with vertices (5, 5, 1),
(5, -5, -2), (5, 1, 1), (-2, -5, 1).
27. Express the iterated integral $\int\left(\int\left(\int g(x)\,dx\right)dy\right)dz$ as a simple integral.
28. Take four vertices of a unit cube, no two adjacent. Find the volume of the
tetrahedron with these points as vertices.
29. (cont.) Now take the tetrahedron whose vertices are the remaining four vertices
of the cube. The two tetrahedra intersect in a certain polyhedron. Describe it and
find its volume.
2. CYLINDRICAL COORDINATES
Cylindrical coordinates are designed to fit situations with rotational
(axial) symmetry about an axis. The cylindrical coordinates of a point
$\mathbf{x} = (x, y, z)$ are $\{r, \theta, z\}$, where $\{r, \theta\}$ are the polar coordinates of (x, y) and
z is the third rectangular coordinate (Fig. 2.1a). Each surface r = constant is
a right circular cylinder, hence the name, cylindrical coordinates (Fig. 2.1b).
Fig. 2.1
Through each point $\mathbf{x}$ (not on the z-axis) pass three surfaces, r = constant,
$\theta$ = constant, z = constant (Fig. 2.2). Each is orthogonal (perpendicular) to
the other two at their common intersection $\mathbf{x}$.
Fig. 2.2
The relations between the rectangular coordinates (x, y, z) and the
cylindrical coordinates $\{r, \theta, z\}$ of a point are
$$x = r\cos\theta, \quad y = r\sin\theta, \quad z = z; \qquad r^2 = x^2 + y^2, \quad z = z.$$
The origin in the plane is given in polar coordinates by r = 0; the angle $\theta$ is
undefined. Similarly, a point on the z-axis is given in cylindrical coordinates
by r = 0, z = constant; $\theta$ is undefined.
EXAMPLE 2.1
Graph the surfaces (i) $z = 2r$, (ii) $z = r^2$.
Solution: Both are surfaces of revolution about the z-axis, as is any
surface $z = f(r)$. Since z depends only on r, not on $\theta$, the height of the
surface is constant above each circle r = c in the x, y-plane. Thus the level
curves are circles in the x, y-plane centered at the origin.
In (i), the surface meets the first quadrant of the y, z-plane in the line $z = 2y$.
(In the first quadrant of the y, z-plane, $x = 0$ and $y > 0$. Since
$r^2 = x^2 + y^2 = y^2$, it follows that $r = y$.) Rotated about the z-axis, this line
spans a cone with apex at 0. See Fig. 2.3a.
Fig. 2.3
In (ii), the surface meets the y, z-plane in the parabola $z = y^2$. Rotated
about the z-axis, this parabola generates a paraboloid of revolution (Fig. 2.3b).
EXAMPLE 2.2
Find the level surfaces of the function $f(r, \theta, z) = r$.
Solution: Each surface is defined by r = c. This is a right circular
cylinder whose axis is the z-axis (Fig. 2.4).
Answer: Concentric right circular cylinders about the z-axis.
The Natural Frame
It is convenient to fit a frame of three mutually perpendicular vectors to
cylindrical coordinates just as the frame i, j, k fits rectangular coordinates.
At each point $\{r, \theta, z\}$ of space attach three mutually perpendicular unit
vectors u, w, k chosen so that
u points in the direction of increasing r,
w points in the direction of increasing $\theta$,
k points in the direction of increasing z.
Fig. 2.5
Thus (Fig. 2.5)
$$\mathbf{u} = (\cos\theta, \sin\theta, 0), \qquad \mathbf{w} = (-\sin\theta, \cos\theta, 0), \qquad \mathbf{k} = (0, 0, 1).$$
The vectors u, w, k form a right-hand system:
$$\mathbf{u}\times\mathbf{w} = \mathbf{k}, \qquad \mathbf{w}\times\mathbf{k} = \mathbf{u}, \qquad \mathbf{k}\times\mathbf{u} = \mathbf{w}.$$
Note that u and w depend on $\theta$ alone, while k is a constant vector, our old
friend from the trio i, j, k. Note also
$$\frac{\partial\mathbf{u}}{\partial\theta} = \mathbf{w}, \qquad \frac{\partial\mathbf{w}}{\partial\theta} = -\mathbf{u}.$$
In situations with axial symmetry, it is frequently better to express
vectors in terms of u, w, k rather than i, j, k.
Integrals
If a solid has axial symmetry, it is often convenient to place the z-axis on the
axis of symmetry, and use cylindrical coordinates $\{r, \theta, z\}$ for the computation
of integrals.
In polar coordinates $\{r, \theta\}$, the element of area is $r\,dr\,d\theta$. Hence the
element of volume in cylindrical coordinates $\{r, \theta, z\}$ is
$$dV = r\,dr\,d\theta\,dz.$$
To gain further insight, let us argue intuitively for a moment. Given
$\mathbf{x} = \{r, \theta, z\}$, how is this point displaced if we give small increments $dr$, $d\theta$, $dz$
to its three coordinates? Figure 2.6 shows that the displacement in the
direction of u is $dr\,\mathbf{u}$.
The displacements in the directions of w and k are $r\,d\theta\,\mathbf{w}$ and $dz\,\mathbf{k}$. The
displacement of $\mathbf{x}$ is approximately the sum of these three small vectors.
Accordingly we write
$$d\mathbf{x} = dr\,\mathbf{u} + r\,d\theta\,\mathbf{w} + dz\,\mathbf{k}.$$
Since the displacements are mutually perpendicular vectors, they span a small
rectangular box of volume $(dr)(r\,d\theta)(dz)$. Thus we arrive again at the
formula $dV = r\,dr\,d\theta\,dz$.
EXAMPLE 2.3
Evaluate $\iiint (x^2 + y^2)^{1/2} z\,dx\,dy\,dz$, taken over the cone
with apex (0, 0, 1) and semicircular base bounded by the x-axis and
$y = \sqrt{4 - x^2}$.
Solution: Write the integral in cylindrical coordinates:
$$I = \iiint (rz)\,r\,dr\,d\theta\,dz.$$
The lateral surface of the cone (Fig. 2.7) is given by
$$z = 1 - \tfrac12 r,$$
so the solid is described by
$$0 < \theta < \pi, \qquad 0 < r < 2, \qquad 0 < z < 1 - \tfrac12 r.$$
Therefore
$$I = \int_0^\pi d\theta\int_0^2 r^2\left(\int_0^{1 - r/2} z\,dz\right)dr = \frac\pi2\int_0^2 r^2\left(1 - r + \tfrac14 r^2\right)dr = \frac\pi8\int_0^2\left(r^4 - 4r^3 + 4r^2\right)dr = \frac{2\pi}{15}.$$
Answer: $2\pi/15$
EXAMPLE 2.4
A region D in space is generated by revolving the plane region
bounded by $z = 2x^2$, the x-axis, and $x = 1$ about the z-axis. Mass is
distributed in D so that the density at each point is proportional to
the distance of the point from the plane $z = -1$, and to the square of
the distance of the point from the z-axis. Compute the total mass.
Solution: The density is
$$\delta = k(x^2 + y^2)(z + 1) = kr^2(z + 1),$$
where k is a constant. The portion of the solid in the first octant is shown in
Fig. 2.8. In cylindrical coordinates, the solid is described by the inequalities
$$0 < \theta < 2\pi, \qquad 0 < r < 1, \qquad 0 < z < 2r^2.$$
Therefore its mass is
$$M = \iiint kr^2(z + 1)\,r\,dr\,d\theta\,dz = k\int_0^{2\pi}d\theta\int_0^1 r^3\left(\int_0^{2r^2}(z + 1)\,dz\right)dr = 2\pi k\int_0^1\left(2r^7 + 2r^5\right)dr = \frac{7\pi k}{6}.$$
EXERCISES
1. Find the rectangular coordinates of $\{2, 5\pi/4, 3\}$, $\{1, \pi, 2\}$.
2. Find the cylindrical coordinates of (0, 2, 2), (5, 5, 0), $(1, \sqrt3, 1)$.
3. A space curve is given by $r = r(t)$, $\theta = \theta(t)$, $z = z(t)$. Show that its arc length
satisfies $\dot{s} = (\dot{r}^2 + r^2\dot\theta^2 + \dot{z}^2)^{1/2}$.
4. (cont.) Find the length of the spiral $r = a$, $\theta = bt$, $z = ct$ for $0 < t < 1$.
5. Prove analytically the formula $d\mathbf{x} = dr\,\mathbf{u} + r\,d\theta\,\mathbf{w} + dz\,\mathbf{k}$.
6.
Prove
$$\begin{bmatrix} d\mathbf{u} \\ d\mathbf{w} \\ d\mathbf{k} \end{bmatrix} = \begin{bmatrix} 0 & d\theta & 0 \\ -d\theta & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}\begin{bmatrix} \mathbf{u} \\ \mathbf{w} \\ \mathbf{k} \end{bmatrix}.$$
7*. Let $f = f(r, \theta, z)$. Prove
$$\operatorname{grad} f = \frac{\partial f}{\partial r}\,\mathbf{u} + \frac1r\frac{\partial f}{\partial\theta}\,\mathbf{w} + \frac{\partial f}{\partial z}\,\mathbf{k}.$$
[Hint: Use Ex. 5.]
8. Use cylindrical coordinates to find the volume of a right circular cone of base
radius a and height h.
9. Find $\iiint xyz\,dx\,dy\,dz$ over the quarter cylinder $0 < r < a$, $0 < \theta < \tfrac12\pi$, $0 < z < h$.
10. Find $\iiint dx\,dy\,dz$ over the domain common to the sphere $x^2 + y^2 + z^2 < a^2$
and the cylinder $r < b$, where $b < a$.
11. Find $\iiint dx\,dy\,dz$ over the cylindrical wedge $0 < z < y$, $r < a$, $0 < \theta < \pi$.
12. Sketch the domain $r^2 < z < 2r^2$, $r < 1$. Now find $\iiint z\,dx\,dy\,dz$ over this
domain.
13. Sketch the domain $|z| < r^2$, $r < a$, $x > 0$, $y > 0$. Now find $\iiint dx\,dy\,dz$ over
the domain.
14*. Sketch the domain $0 < r < \cos 2\theta$, $-\tfrac14\pi < \theta < \tfrac14\pi$, $0 < z < 1 - r^2$. Now find
$\iiint rz\,dx\,dy\,dz$. (Use the definite integral table inside the front cover to complete
the solution.)
3. SPHERICAL COORDINATES
Spherical coordinates are designed to fit situations with central symmetry.
The spherical coordinates $[\rho, \phi, \theta]$ of a point $\mathbf{x}$ are its distance $\rho = |\mathbf{x}|$
from the origin, its elevation angle $\phi$, and its azimuth angle $\theta$. (Often $\theta$ is
called the longitude and $\phi$ the co-latitude.) See Fig. 3.1.
Fig. 3.1
Relations between the rectangular coordinates (x, y, z) of a point and its
spherical coordinates may be read from Fig. 3.2. They are
$$x = \rho\sin\phi\cos\theta, \qquad y = \rho\sin\phi\sin\theta, \qquad z = \rho\cos\phi.$$
Note that $\theta$ is not determined on the z-axis, so points of this axis are usually
avoided. In general $\theta$ is determined up to a multiple of $2\pi$, and $0 < \phi < \pi$.
The level surfaces
$\rho$ = constant are concentric spheres about 0,
$\phi$ = constant are right circular cones, apex 0,
$\theta$ = constant are planes through the z-axis.
At each point $\mathbf{x}$ the three level surfaces intersect orthogonally (Fig. 3.3).
The Natural Frame
Select unit vectors $\boldsymbol\lambda$, $\boldsymbol\mu$, $\boldsymbol\nu$ at each point $\mathbf{x}$ of space (not on the z-axis) such
that
$\boldsymbol\lambda$ points in the direction of increasing $\rho$,
$\boldsymbol\mu$ points in the direction of increasing $\phi$,
$\boldsymbol\nu$ points in the direction of increasing $\theta$.
See Fig. 3.4. Then $\boldsymbol\lambda$, $\boldsymbol\mu$, $\boldsymbol\nu$ is a right-hand system.
Our immediate problem is to express $\boldsymbol\lambda$, $\boldsymbol\mu$, $\boldsymbol\nu$ in terms of $\rho$, $\phi$, $\theta$. Here is a
short cut for doing so:
$$\mathbf{x} = \rho(\sin\phi\cos\theta, \sin\phi\sin\theta, \cos\phi),$$
$$d\mathbf{x} = (\sin\phi\cos\theta, \sin\phi\sin\theta, \cos\phi)\,d\rho + \rho(\cos\phi\cos\theta, \cos\phi\sin\theta, -\sin\phi)\,d\phi + \rho(-\sin\phi\sin\theta, \sin\phi\cos\theta, 0)\,d\theta$$
$$= \boldsymbol\lambda\,d\rho + \boldsymbol\mu\,\rho\,d\phi + \boldsymbol\nu\,\rho\sin\phi\,d\theta.$$
It is easy to check that $\boldsymbol\lambda$, $\boldsymbol\mu$, $\boldsymbol\nu$ are unit vectors. Furthermore, they point in
the directions of increasing $\rho$, $\phi$, $\theta$ respectively. Precisely, if only $\rho$ increases,
then $d\phi = d\theta = 0$, hence $d\mathbf{x} = \boldsymbol\lambda\,d\rho$. Similarly, if only $\phi$ increases, then
$d\mathbf{x} = \boldsymbol\mu\,\rho\,d\phi$, and if only $\theta$ increases, then $d\mathbf{x} = \boldsymbol\nu\,\rho\sin\phi\,d\theta$.
Fig. 3.3
Fig. 3.4
Conclusion:
$$\boldsymbol\lambda = (\sin\phi\cos\theta, \sin\phi\sin\theta, \cos\phi),$$
$$\boldsymbol\mu = (\cos\phi\cos\theta, \cos\phi\sin\theta, -\sin\phi),$$
$$\boldsymbol\nu = (-\sin\theta, \cos\theta, 0),$$
$$d\mathbf{x} = \boldsymbol\lambda\,d\rho + \boldsymbol\mu\,\rho\,d\phi + \boldsymbol\nu\,\rho\sin\phi\,d\theta.$$
Integrals
If a solid has central symmetry, it is often convenient to place the origin
at the center of symmetry and use spherical coordinates $[\rho, \phi, \theta]$ for the
computation of integrals. The displacement of a point of space, in terms of
the frame $\boldsymbol\lambda$, $\boldsymbol\mu$, $\boldsymbol\nu$ natural to spherical coordinates, is
$$d\mathbf{x} = d\rho\,\boldsymbol\lambda + \rho\,d\phi\,\boldsymbol\mu + \rho\sin\phi\,d\theta\,\boldsymbol\nu.$$
See Fig. 3.5. The element of volume of the small rectangular solid is the
product of its sides:
$$dV = (d\rho)(\rho\,d\phi)(\rho\sin\phi\,d\theta) = \rho^2\sin\phi\,d\rho\,d\phi\,d\theta.$$
EXAMPLE 3.1
Use spherical coordinates to find the volume of a sphere of
radius a.
Solution:
$$V = \iiint \rho^2\sin\phi\,d\rho\,d\phi\,d\theta = \left(\int_0^{2\pi}d\theta\right)\left(\int_0^\pi\sin\phi\,d\phi\right)\left(\int_0^a\rho^2\,d\rho\right) = (2\pi)(2)\left(\tfrac13 a^3\right).$$
Answer: $\tfrac43\pi a^3$.
EXAMPLE 3.2
Find the volume of the portion of the unit sphere which lies in
the right circular cone having its apex at the origin and making
angle $\alpha$ with the positive z-axis.
Solution: The cone is specified by $0 < \phi < \alpha$, so the portion of the
sphere is determined by $0 < \theta < 2\pi$, $0 < \phi < \alpha$, and $0 < \rho < 1$. See Fig. 3.6.
Hence the volume is
$$\left(\int_0^{2\pi}d\theta\right)\left(\int_0^\alpha\sin\phi\,d\phi\right)\left(\int_0^1\rho^2\,d\rho\right) = \frac{2\pi}{3}(1 - \cos\alpha).$$
Remark: As a check, let $\alpha \to \pi$. Then the volume should approach
the volume of a sphere of radius 1.
Does it?
EXAMPLE 3.3
A solid fills the region between concentric spheres of radii $a$ and $b$, where $0 < a < b$. The density at each point is inversely proportional to its distance from the center. Find the total mass.
Solution: The solid is specified by $a \le \rho \le b$; the density is $\delta = k/\rho$. Hence
$$M = \iiint \frac{k}{\rho}\,\rho^2\sin\phi\,d\rho\,d\phi\,d\theta = k\left(\int_0^{2\pi} d\theta\right)\left(\int_0^{\pi}\sin\phi\,d\phi\right)\left(\int_a^{b}\rho\,d\rho\right) = (2\pi k)(2)\left(\frac{b^2 - a^2}{2}\right).$$
Answer: $2\pi k(b^2 - a^2)$.
Remark: As $a \to 0$, the solid tends to the whole sphere, with infinite density at the center. But $M \to 2\pi k b^2$, which is finite.
EXAMPLE 3.4
A cylindrical hole of radius $\frac12$ is bored through a sphere of radius 1. The surface of the hole passes through the center of the sphere. How much material is removed?
Solution: Center the sphere at $\mathbf{0}$ and let the cylinder (hole) be parallel to the $z$-axis, with axis through $(0, \frac12, 0)$. See Fig. 3.7.
Fig. 3.7
The equation of the cylindrical surface is $x^2 + (y - \frac12)^2 = \frac14$, or $x^2 + y^2 = y$. But in spherical coordinates, $x^2 + y^2 = \rho^2\sin^2\phi$ and $y = \rho\sin\phi\sin\theta$. Hence the equation of the cylinder is $\rho^2\sin^2\phi = \rho\sin\phi\sin\theta$, or
$$\rho = \frac{\sin\theta}{\sin\phi}.$$
The cylinder intersects the sphere in the curve (Fig. 3.8)
$$\rho = \frac{\sin\theta}{\sin\phi} = 1.$$
In the first octant this is the curve $\rho = 1$, $\phi = \theta$.
The first-octant portion of the volume naturally splits into two parts, separated by the cone $\phi = \theta$ of segments joining $\mathbf{0}$ to the curve of intersection. In the lower part, each radius from the origin ends at the cylinder, so the limits are
$$0 \le \theta \le \frac{\pi}{2}, \qquad \theta \le \phi \le \frac{\pi}{2}, \qquad 0 \le \rho \le \frac{\sin\theta}{\sin\phi}.$$
In the upper part, each radius from the origin ends on the sphere, so the limits are
$$0 \le \theta \le \frac{\pi}{2}, \qquad 0 \le \phi \le \theta, \qquad 0 \le \rho \le 1.$$
By symmetry the required volume is four times the first-octant volume; thus $V = 4(V_1 + V_2)$, where $V_1$ is the volume of the lower part and $V_2$ that of the upper. Now
$$V_1 = \int_0^{\pi/2} d\theta \int_\theta^{\pi/2}\sin\phi\,d\phi \int_0^{\sin\theta/\sin\phi}\rho^2\,d\rho = \frac13\int_0^{\pi/2}\sin^3\theta\,d\theta\int_\theta^{\pi/2}\frac{d\phi}{\sin^2\phi}$$
$$= \frac13\int_0^{\pi/2}\sin^3\theta\,\cot\theta\,d\theta = \frac13\int_0^{\pi/2}\sin^2\theta\cos\theta\,d\theta = \frac19,$$
and
$$V_2 = \int_0^{\pi/2} d\theta \int_0^{\theta}\sin\phi\,d\phi \int_0^{1}\rho^2\,d\rho = \frac13\int_0^{\pi/2}(1 - \cos\theta)\,d\theta = \frac13\left(\frac{\pi}{2} - 1\right).$$
Therefore
$$V = 4\left(\frac19 + \frac{\pi}{6} - \frac13\right) = \frac{2\pi}{3} - \frac89.$$
Answer: $\frac{2\pi}{3} - \frac{8}{9}$.
Spherical Area
Suppose a point moves on the surface of the sphere $\rho = a$. Then $d\rho = 0$, so the formula for displacement specializes to
$$d\mathbf{x} = a\,d\phi\,\boldsymbol{\mu} + a\sin\phi\,d\theta\,\boldsymbol{\nu}.$$
The area of the small "rectangular" region on the surface of the sphere corresponding to changes $d\phi$ and $d\theta$ is
$$dA = (a\,d\phi)(a\sin\phi\,d\theta) = a^2\sin\phi\,d\phi\,d\theta.$$
See Fig. 3.9. The integral of this expression over a region is the area of that region.
EXAMPLE 3.5
Find the area of the polar cap, all points of co-latitude $\alpha$ or less on the unit sphere.
Solution: See Fig. 3.6. The region is defined on the sphere $\rho = 1$ by $0 \le \phi \le \alpha$, hence
$$A = \int_0^{2\pi}\left(\int_0^{\alpha}\sin\phi\,d\phi\right)d\theta = 2\pi(1 - \cos\alpha).$$
Answer: $2\pi(1 - \cos\alpha)$.
Remark: Suppose $S$ is a region on the unit sphere. The totality of infinite rays starting at $\mathbf{0}$ and passing through points of $S$ is a cone which is called a solid angle (Fig. 3.10). A solid angle is measured by the area of the base region $S$. Since $S$ lies on the unit sphere, its area is measured in square radians, a dimensionless unit. Thus the solid angle of the whole sphere has measure $4\pi$ rad$^2$, or simply $4\pi$. The solid angle determined by the first octant has measure $4\pi/8 = \pi/2$. The solid angle of the polar cap in Example 3.5 has measure $2\pi(1 - \cos\alpha)$. Incidentally, in the metric (SI) system, the unit for solid angles is the steradian (sr).
Fig. 3.10
EXERCISES
1. Convert to rectangular coordinates: $[1, 3\pi/4, 3\pi/4]$, $[3, \pi/2, 5\pi/4]$, $[2, 2\pi/3, \pi/3]$.
2. Convert to spherical coordinates: (1, 1, 1), (1, -1, 1), (-1, -1, -1), (-1, 1, -1).
3. If $\mathbf{x}$ has spherical coordinates $[\rho, \phi, \theta]$, find the spherical coordinates of $-\mathbf{x}$ and of $3\mathbf{x}$.
4. Suppose a space curve is given by $\rho = \rho(t)$, $\phi = \phi(t)$, $\theta = \theta(t)$.
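Show that its arc length satisfies $\dot{s}^2 = \dot{\rho}^2 + \rho^2\dot{\phi}^2 + \rho^2(\sin^2\phi)\,\dot{\theta}^2$.
5. (cont.) A rhumb line on a sphere of radius $a$ is a curve that intersects each meridian at the same angle $\alpha$. (Follow a constant compass setting.) Find the length of a rhumb line from the equator to the north pole.
6*. Let $f = f(\rho, \phi, \theta)$. Prove
$$\operatorname{grad} f = \frac{\partial f}{\partial\rho}\,\boldsymbol{\lambda} + \frac{1}{\rho}\frac{\partial f}{\partial\phi}\,\boldsymbol{\mu} + \frac{1}{\rho\sin\phi}\frac{\partial f}{\partial\theta}\,\boldsymbol{\nu}.$$
7. Prove
$$\begin{pmatrix} d\boldsymbol{\lambda} \\ d\boldsymbol{\mu} \\ d\boldsymbol{\nu} \end{pmatrix} = \begin{pmatrix} 0 & d\phi & \sin\phi\,d\theta \\ -d\phi & 0 & \cos\phi\,d\theta \\ -\sin\phi\,d\theta & -\cos\phi\,d\theta & 0 \end{pmatrix}\begin{pmatrix} \boldsymbol{\lambda} \\ \boldsymbol{\mu} \\ \boldsymbol{\nu} \end{pmatrix}.$$
8. Let $\mathbf{v}$ be a radial vector field of the form $\mathbf{v} = g(\rho)\,\boldsymbol{\lambda}$. Prove that $\mathbf{v}$ is a gradient field. [Hint: Use Ex. 6.]
9. Suppose $f[\rho, \phi, \theta]$ is homogeneous of degree $n$ with respect to rectangular coordinates, that is, $f(t\mathbf{x}) = t^n f(\mathbf{x})$ for all $t > 0$. Prove $f[\rho, \phi, \theta] = \rho^n g(\phi, \theta)$, where $g(\phi, \theta) = f[1, \phi, \theta]$.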
10. Find / / / 'ndx dy dz over the sphere p < a, where n > 0.
11. Find J J J (1p)ndx dy dz over the sphere p < 1, where n > 0.
12. Find J j j 1 dx dy dz over the first octant portion of the sphere p < a.
13. Find /// p"~2dx dy dz over the domain z > 0, a < p < b, where 0 < a < b.
14. Find /// dx dy dz over the domain 1< z, p < f \/3.
15. Two parallel planes at distance h intersect a sphere of radius a. Find the surface
area of the spherical zone between the planes.
16. In Example 3.4, how much of the surface area of the sphere is removed?
Additional Volume Problems
Use any method you please to set up the integral. You may not be able to evaluate it,
but at least try to reduce the answer to a simple integral, not a double or triple one.
17. Find the volume of the (solid) right circular torus obtained by revolving the circle
$(y - A)^2 + z^2 = a^2$, $x = 0$ about the z-axis. Assume $0 < a < A$.
18. Suppose in Ex. 17 that $0 < A < a$. Let the portion of the circle to the right of the
z-axis generate volume $V_1$ and the portion to the left generate volume $V_2$. Find
$V_1 - V_2$.
19*. A sphere of radius a touches the sides of a cone of semi-apex angle a. See Fig. 3.11.
Find the volume of the portion of the cone above the sphere.
Fig. 3.11 (Ex. 19): front view and side view
20*. A circular hole is bored through a right circular cone, dimensions as indicated in
Fig. 3.12. The axes are perpendicular and the hole just fits. How much material is
removed?
21*. (cont.) How much of the cone remains above the hole?
Fig. 3.12 (Exs. 20, 21): front view and side view
Fig. 3.13 (Ex. 25)
Fig. 3.15 (Ex. 27): front and side views
22*. A cylindrical hole of radius a is bored through a solid cylinder of radius 2a; the
hole is perpendicular to the solid cylinder and just touches a generator. Find the
volume removed.
23. A circular hole of radius $a$ is bored through a solid right circular cylinder of
radius $b$. Assume the axis of the hole intersects the axis of the solid at a right angle
and that $a < b$. Show that the volume of material removed is
$$8\int_0^{a}\sqrt{a^2 - x^2}\,\sqrt{b^2 - x^2}\,dx = 8a^2\int_0^{\pi/2}(\sin^2\theta)\sqrt{b^2 - a^2\cos^2\theta}\,d\theta.$$
24. Do Ex. 23 assuming that the axes meet obliquely at angle a. Set up, but do not
evaluate the integral. Can you evaluate it when a = b?
25. A square hole is bored through a right circular cylinder (Fig. 3.13). How much
material is removed?
26*. A square hole is bored through a right circular cylinder (Fig. 3.14). How much
material is removed?
27*. A square hole is bored through a right circular cone (Fig. 3.15). How much
material is removed? (Do not evaluate the integral.)
4. CENTER OF GRAVITY
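Suppose a solid $D$ has density $\delta(\mathbf{x})$ at each point $\mathbf{x}$. Then its mass is
$$M = \iiint_D \delta(\mathbf{x})\,dx\,dy\,dz.$$
Define its moment about the origin
$$\mathbf{m} = \iiint_D \delta(\mathbf{x})\,\mathbf{x}\,dx\,dy\,dz,$$
and its center of gravity
$$\bar{\mathbf{x}} = \frac{1}{M}\,\mathbf{m}.$$
Note that $\mathbf{m}$ and $\bar{\mathbf{x}}$ are vectors. It is often convenient to express them in terms
of components:
$$\mathbf{m} = (m_x, m_y, m_z), \qquad \bar{\mathbf{x}} = (\bar{x}, \bar{y}, \bar{z}).$$
The center of gravity may be considered as a sort of weighted average of the
points of the solid. Recall in this connection that the center of gravity of a
system of point-masses $M_1, \ldots, M_n$ located at $\mathbf{x}_1, \ldots, \mathbf{x}_n$ is
$$\bar{\mathbf{x}} = \frac{1}{M}\,(M_1\mathbf{x}_1 + M_2\mathbf{x}_2 + \cdots + M_n\mathbf{x}_n),$$
where $M = M_1 + \cdots + M_n$.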
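The point-mass formula can be tried out directly. The following is only a minimal sketch; the function name `center_of_gravity` and the sample masses and positions are illustrative assumptions, not part of the text.

```python
def center_of_gravity(masses, points):
    """Return (M, x_bar) for point masses M_i located at the points x_i."""
    total_mass = sum(masses)
    dim = len(points[0])
    # Moment m = sum of M_i * x_i, computed one component at a time.
    moment = [sum(m * p[k] for m, p in zip(masses, points)) for k in range(dim)]
    return total_mass, tuple(c / total_mass for c in moment)

# Hypothetical system: three masses in space (units are arbitrary).
M, x_bar = center_of_gravity([1.0, 2.0, 3.0], [(0, 0, 0), (1, 0, 0), (0, 2, 0)])
print(M, x_bar)
```

The heaviest mass pulls the center of gravity toward its own position, which is the "weighted average" reading of the formula.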
If a solid is symmetric in a coordinate plane, then the center of gravity
lies on that coordinate plane. For example, suppose $D$ is symmetric in the
x, y-plane. This means that whenever a point $(x, y, z)$ is in the solid, then
$(x, y, -z)$ is in the solid, and $\delta(x, y, -z) = \delta(x, y, z)$. The contribution to
$m_z$ at $(x, y, z)$ is
$$\delta(x, y, z)\,z\,dx\,dy\,dz;$$
it is cancelled by the contribution
$$\delta(x, y, -z)\,(-z)\,dx\,dy\,dz = -\delta(x, y, z)\,z\,dx\,dy\,dz$$
at $(x, y, -z)$. Hence $m_z = 0$ and $\bar{z} = 0$.
Similarly, if $D$ is symmetric in a coordinate axis, then the center of gravity
lies on that axis. Finally, if $D$ is symmetric in the origin, then $\bar{\mathbf{x}} = \mathbf{0}$.
To compute the center of gravity of a solid, exploit any symmetry it has
by choosing an appropriate coordinate system. Of course, express the element
of volume $dV = dx\,dy\,dz$ in the coordinate system chosen.
EXAMPLE 4.1
Find the center of gravity of a uniform hemisphere of radius a
and mass M.
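Fig. 4.1 (a), (b)
Solution: To exploit symmetry, choose spherical coordinates, with the
hemisphere defined by $\rho \le a$ and $0 \le \phi \le \pi/2$. See Fig. 4.1a. The density $\delta$ is
constant, so the mass is $\delta$ times the volume:
$$M = \frac12\left(\frac43\pi a^3\right)\delta = \frac23\pi a^3\delta.$$
Because the hemisphere is symmetric in the y, z-plane, $m_x = 0$. Likewise
$m_y = 0$; only $m_z$ requires computation:
$$m_z = \iiint \delta\,z\,dV = \delta\iiint(\rho\cos\phi)\,\rho^2\sin\phi\,d\rho\,d\phi\,d\theta = \delta\left(\int_0^{2\pi}d\theta\right)\left(\int_0^{\pi/2}\cos\phi\sin\phi\,d\phi\right)\left(\int_0^{a}\rho^3\,d\rho\right) = \frac{\pi\delta a^4}{4}.$$
Hence
$$\bar{z} = \frac{m_z}{M} = \frac{\pi\delta a^4/4}{2\pi\delta a^3/3} = \frac38\,a.$$
Answer: The center of gravity lies on the axis
of the hemisphere, $\frac38$ of the distance from the
equatorial plane to the pole (Fig. 4.1b).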
Remark: The answer is independent of $\delta$. For uniform solids in general,
the constant $\delta$ cancels when you divide $\mathbf{m}$ by $M$, so $\bar{\mathbf{x}}$ is a purely geometric
quantity. It is then called the center of gravity or centroid of the
geometric region (rather than the material solid). For uniform solids, from
now on take $\delta = 1$ and $M = V$, the volume.
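As a numerical cross-check of Example 4.1 with $\delta = 1$, the sketch below approximates the volume and the moment $m_z$ of the hemisphere in spherical coordinates. The midpoint rule, the helper names, and the choice $a = 2$ are assumptions made for illustration only.

```python
import math

def midpoint(f, lo, hi, n=2000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

def hemisphere_centroid_z(a):
    # With density 1, both triple integrals factor into one-variable integrals
    # because dV = rho^2 sin(phi) d(rho) d(phi) d(theta) and z = rho cos(phi).
    theta_part = 2 * math.pi
    volume = (theta_part
              * midpoint(math.sin, 0.0, math.pi / 2)
              * midpoint(lambda r: r ** 2, 0.0, a))
    m_z = (theta_part
           * midpoint(lambda p: math.cos(p) * math.sin(p), 0.0, math.pi / 2)
           * midpoint(lambda r: r ** 3, 0.0, a))
    return m_z / volume

z_bar = hemisphere_centroid_z(2.0)
print(z_bar, 3 * 2.0 / 8)  # numerical value next to the closed form 3a/8
```

The two printed numbers should agree to several decimal places, in line with the remark that the centroid depends only on the geometry.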
EXAMPLE 4.2
Find the center of gravity of a uniform right circular cone.
Fig. 4.2 (a), (b)
Solution: Choose cylindrical coordinates with the apex of the cone at
$\mathbf{0}$ and the base of radius $a$ centered at $(0, 0, h)$. See Fig. 4.2a. The lateral
surface of the cone is
$$z = \frac{h}{a}\,r, \quad\text{that is,}\quad r = \frac{a}{h}\,z,$$
and the volume of the cone is $V = \frac13\pi a^2 h$.
By symmetry $m_x = m_y = 0$. Compute $m_z$:
$$m_z = \iiint z\,r\,dr\,d\theta\,dz = \int_0^{2\pi}\left[\int_0^{h}\left(\int_0^{az/h} r\,dr\right) z\,dz\right]d\theta = (2\pi)\left(\frac12\right)\left(\frac{a^2}{h^2}\right)\int_0^{h} z^3\,dz = \frac{\pi a^2}{h^2}\cdot\frac{h^4}{4} = \frac{\pi}{4}\,a^2h^2.$$
Hence
$$\bar{z} = \frac{m_z}{V} = \frac{\pi a^2h^2/4}{\pi a^2h/3} = \frac34\,h.$$
Answer: The center of gravity is on
the cone's axis, $\frac14$ of the distance from
the base to the apex (Fig. 4.2b).
EXAMPLE 4.3
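The solid $0 \le x \le 1$, $0 \le y \le 2$, $0 \le z \le 3$ has density $xyz$
gm/cm$^3$. Find its center of gravity.
Solution:
$$M = \iiint xyz\,dx\,dy\,dz = \int_0^1 x\,dx\int_0^2 y\,dy\int_0^3 z\,dz = \frac12\cdot\frac42\cdot\frac92 = \frac92 \text{ gm},$$
$$\mathbf{m} = \iiint (x, y, z)\,xyz\,dx\,dy\,dz = \left(\iiint x^2yz\,dx\,dy\,dz,\ \iiint xy^2z\,dx\,dy\,dz,\ \iiint xyz^2\,dx\,dy\,dz\right)$$
$$= \left(\int_0^1 x^2dx\int_0^2 y\,dy\int_0^3 z\,dz,\ \int_0^1 x\,dx\int_0^2 y^2dy\int_0^3 z\,dz,\ \int_0^1 x\,dx\int_0^2 y\,dy\int_0^3 z^2dz\right)$$
$$= \left(\frac13\cdot\frac42\cdot\frac92,\ \frac12\cdot\frac83\cdot\frac92,\ \frac12\cdot\frac42\cdot\frac{27}{3}\right) = \frac{4\cdot 9}{3\cdot 2\cdot 2}\,(1, 2, 3) = 3\,(1, 2, 3) \text{ gm-cm}.$$
Hence
$$\bar{\mathbf{x}} = \frac{1}{M}\,\mathbf{m} = \frac29\,(3)(1, 2, 3) = \left(\frac23, \frac43, 2\right) \text{ cm}.$$
Answer: $\bar{\mathbf{x}} = \left(\frac23, \frac43, 2\right)$ cm.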
Plane Regions
The definitions of moment and center of gravity for solids can be modified
in an obvious way to fit plane regions with a mass distribution.
EXAMPLE 4.4
Find the center of gravity of a uniform semicircular disk.
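Fig. 4.3 (a), (b)
Solution: Choose polar coordinates, taking the disk in the position
$0 \le r \le a$, $0 \le \theta \le \pi$. See Fig. 4.3a. Take $\delta = 1$; the mass equals the area
$$A = \frac12(\pi a^2).$$
By symmetry $m_x = 0$. Compute $m_y$:
$$m_y = \iint y\,r\,dr\,d\theta = \iint r^2\sin\theta\,dr\,d\theta = \int_0^{\pi}\sin\theta\,d\theta\int_0^{a} r^2\,dr = \frac23\,a^3.$$
Hence
$$\bar{\mathbf{x}} = \frac{1}{A}\,\mathbf{m} = \frac{2}{\pi a^2}\left(0, \frac{2a^3}{3}\right) = \left(0, \frac{4a}{3\pi}\right).$$
Answer: $\bar{\mathbf{x}} = \left(0, \dfrac{4a}{3\pi}\right)$. See Fig. 4.3b.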
There is a useful connection between the centers of gravity of plane regions
and volumes of revolution.
First Pappus Theorem Suppose a region $D$ in the x, y-plane, to the right
of the y-axis, is revolved about the y-axis. Then the volume of the resulting
solid is
$$V = 2\pi\bar{x}A,$$
where $A$ is the area of the plane region $D$ and $\bar{x}$ is the x-coordinate of its
center of gravity (Fig. 4.4).
In words, the volume is the area times the length of the circle traced by the
center of gravity. Proof: A small portion $dx\,dy$ at $\mathbf{x}$ revolves into a thin ring of
volume
$$dV = 2\pi x\,dx\,dy.$$
Hence
$$V = \iint_D 2\pi x\,dx\,dy = 2\pi m_x.$$
But $m_x = \bar{x}A$, hence $V = 2\pi\bar{x}A$.
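A quick sanity check of the First Pappus Theorem: revolving the half-disk $x \ge 0$, $x^2 + y^2 \le a^2$ about the y-axis should produce a solid sphere of radius $a$. The sketch below assumes the centroid distance $4a/(3\pi)$ of a uniform semicircular disk (Example 4.4, with the disk turned so that it lies to the right of the y-axis); the function name and the value $a = 1.5$ are illustrative.

```python
import math

def pappus_volume(x_bar, area):
    """Volume swept out when a plane region is revolved about the y-axis."""
    return 2 * math.pi * x_bar * area

a = 1.5                              # hypothetical radius
x_bar = 4 * a / (3 * math.pi)        # centroid distance of the half-disk from the axis
area = math.pi * a ** 2 / 2          # area of the half-disk
volume = pappus_volume(x_bar, area)
sphere = 4 * math.pi * a ** 3 / 3    # known volume of a sphere of radius a
print(volume, sphere)
```

Both numbers agree, as the theorem predicts.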
Wires
A non-homogeneous wire is described by its position, a space curve $\mathbf{x} = \mathbf{x}(s)$
where $a \le s \le b$, and its density $\delta = \delta(s)$. (Here $s$ denotes arc length.) Its
mass is
$$M = \int_a^b \delta(s)\,ds,$$
its moment is
$$\mathbf{m} = \int_a^b \mathbf{x}(s)\,\delta(s)\,ds,$$
and its center of gravity is
$$\bar{\mathbf{x}} = \frac{1}{M}\,\mathbf{m}.$$
If the wire is uniform, then $\delta(s)$ is a constant. In this case, the center of
gravity is independent of $\delta$, hence it is a property of the curve $\mathbf{x} = \mathbf{x}(s)$ alone;
you can take $\delta = 1$ and replace $M$ by $L$, the length.
EXAMPLE 4.5
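Find the center of gravity of the uniform semicircle $r = a$, $y \ge 0$.
Solution: The length is $L = \pi a$. The moment is
$$\mathbf{m} = \int \mathbf{x}\,ds = \int_0^{\pi}(a\cos\theta,\ a\sin\theta)\,a\,d\theta = a^2\int_0^{\pi}(\cos\theta,\ \sin\theta)\,d\theta = a^2(0, 2).$$
Hence
$$\bar{\mathbf{x}} = \frac{1}{L}\,\mathbf{m} = \frac{1}{\pi a}\,a^2(0, 2) = \frac{a}{\pi}\,(0, 2).$$
Answer: $\bar{\mathbf{x}} = \left(0, \dfrac{2a}{\pi}\right)$.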
Suppose a plane curve is revolved about an axis in its plane, generating
a surface of revolution. There is a useful relation between the center of gravity
of the curve and the area of the surface.
Second Pappus Theorem Suppose a curve in the x, y-plane to the right
of the y-axis is revolved about the y-axis. Then the area of the resulting
surface is
$$A = 2\pi\bar{x}L,$$
where $L$ is the length of the curve and $\bar{x}$ is the x-coordinate of the center of
gravity (Fig. 4.5).
In words, the area is the length of the curve times the length of the circle
traced by the center of gravity. Proof: A short segment of length $ds$ of the
curve at the point $\mathbf{x}(s)$ revolves into the frustum of a cone with lateral area
$dA = 2\pi x\,ds$. See Fig. 4.6. Hence
$$A = \int_a^b 2\pi x\,ds = 2\pi\int_a^b x\,ds = 2\pi m_x = 2\pi\bar{x}L.$$
There is a useful aid to problem solving which we state for solids. With
obvious modifications, it applies to wires or laminas.
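Addition Law Suppose a solid $D$ of mass $M$ and center of gravity $\bar{\mathbf{x}}$ is cut
into two pieces $D_0$ and $D_1$, of masses $M_0$ and $M_1$, and centers of gravity $\bar{\mathbf{x}}_0$
and $\bar{\mathbf{x}}_1$. Then
$$M = M_0 + M_1, \qquad \bar{\mathbf{x}} = \frac{1}{M}\,(M_0\bar{\mathbf{x}}_0 + M_1\bar{\mathbf{x}}_1).$$
The first formula is obvious. The second is simply a decomposition of the
moment integral:
$$M\bar{\mathbf{x}} = \iiint_D \delta(\mathbf{x})\,\mathbf{x}\,dV = \iiint_{D_0} + \iiint_{D_1} = M_0\bar{\mathbf{x}}_0 + M_1\bar{\mathbf{x}}_1.$$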
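The Addition Law is easy to apply mechanically. Here is a small sketch for a lamina rather than a solid; the L-shaped region, built from two uniform rectangles of density 1, and the helper name `combine` are assumptions chosen for illustration.

```python
def combine(m0, x0, m1, x1):
    """Addition Law: mass and center of gravity of two pieces taken together."""
    total = m0 + m1
    x_bar = tuple((m0 * p + m1 * q) / total for p, q in zip(x0, x1))
    return total, x_bar

# Piece D0: rectangle [0, 2] x [0, 1], area 2, centroid (1.0, 0.5).
# Piece D1: rectangle [0, 1] x [1, 2], area 1, centroid (0.5, 1.5).
mass, x_bar = combine(2.0, (1.0, 0.5), 1.0, (0.5, 1.5))
print(mass, x_bar)
```

The L-shape is symmetric in the line $y = x$, so the two coordinates of the result come out equal, a useful check on the arithmetic.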
EXERCISES
1. Find the center of gravity of the first octant portion of the uniform sphere p < a.
2. Suppose the density of the hemisphere $\rho \le a$, $z \ge 0$ is $\delta = a - \rho$. Find the center
of gravity.
3. Find the center of gravity of the uniform spherical cone $\rho \le a$, $0 \le \phi \le \alpha$.
4. Find the center of gravity of the uniform hemispherical shell a < p < b, z > 0.
5. Find the center of gravity of a uniform circular wedge (sector).
6. The plane region bounded by $z = 1$ and the parabola $z = y^2$ is revolved about
the z-axis. Find the center of gravity of the resulting uniform solid.
7. Find the center of gravity of a uniform wire in the shape of a quarter circle.
8. Find the center of gravity of the uniform spiral $\mathbf{x}(t) = (a\cos t,\ a\sin t,\ bt)$,
$0 \le t \le t_0$.
9. A copper wire in the shape of a semicircle of radius 100 cm is steadily tapered
from 0.1 to 0.5 cm in diameter. Find its center of gravity.
10. Verify the First Pappus Theorem for a semicircle revolved about its diameter.
11. Use the First Pappus Theorem to find the volume of a right circular torus.
12. Verify the First Pappus Theorem for a right triangle revolved about a leg.
13. Use the Second Pappus Theorem to find the surface area of a right circular torus.
14. Use the Second Pappus Theorem to obtain another solution of Example 4.5.
15. Find the center of gravity of the uniform spherical cap (surface)
$\rho = a$, $0 \le \phi \le \alpha$.
16. Find the center of gravity of the uniform spherical cap (solid)
$\rho \le a$, $a - h \le z \le a$.
17. Suppose a solid of density $\delta(\mathbf{x})$ is acted upon by a uniform (constant) gravitational
field $\mathbf{f}$ so the force on a small portion at $\mathbf{x}$ is $[\delta(\mathbf{x})\,dV]\,\mathbf{f}$. Show that the solid
is in equilibrium if a single force $-M\mathbf{f}$ is applied at $\bar{\mathbf{x}}$.
18. Find the center of gravity of the uniform triangle with vertices a, b, c.
19. Find the center of gravity of the uniform tetrahedron with vertices a, b, c, d.
5. MOMENTS OF INERTIA
Recall that the kinetic energy of a moving particle of mass $m$ and speed
$v$ is $K = \frac12 mv^2$. The kinetic energy of a system of moving particles is the sum
of their individual kinetic energies. To define the kinetic energy of a moving
solid body $D$, decompose the body into elementary masses $\delta(\mathbf{x})\,dV$, where
$\delta(\mathbf{x})$ is the density at $\mathbf{x}$, and form the integral
$$K = \frac12\iiint_D \delta(\mathbf{x})\,|\mathbf{v}(\mathbf{x}, t)|^2\,dV.$$
Here $\mathbf{v}(\mathbf{x}, t)$ is the velocity of the point $\mathbf{x}$ of the body at time $t$. Thus $K$ varies
with time.
We shall compute the kinetic energy of a rigid body rotating with angular
velocity $\boldsymbol{\omega}$ about an axis through the origin.
First consider rotation about the z-axis with angular speed $\omega$, so the
angular velocity is $\boldsymbol{\omega} = \omega\mathbf{k}$. See Fig. 5.1. An elementary volume $dV = dx\,dy\,dz$
at $\mathbf{x}$ has mass $dM = \delta(\mathbf{x})\,dV$ and speed $\omega\sqrt{x^2 + y^2}$, since $\sqrt{x^2 + y^2}$ is the
distance from $\mathbf{x}$ to the z-axis. Hence its kinetic energy is
$$dK = \frac12\left(\omega\sqrt{x^2 + y^2}\right)^2\delta(\mathbf{x})\,dV = \frac12\,\omega^2\,\delta(\mathbf{x})(x^2 + y^2)\,dV.$$
Fig. 5.1
To find the kinetic energy of the entire solid $D$, integrate, obtaining
$$K = \frac12\,I_{zz}\,\omega^2,$$
where
$$I_{zz} = \iiint_D \delta(\mathbf{x})(x^2 + y^2)\,dV$$
is the moment of inertia of $D$ about the z-axis.
Similarly define
$$I_{xx} = \iiint_D \delta(\mathbf{x})(y^2 + z^2)\,dV \quad\text{and}\quad I_{yy} = \iiint_D \delta(\mathbf{x})(z^2 + x^2)\,dV.$$
From these moments of inertia one can compute the kinetic energy,
provided the body rotates about one of the coordinate axes:
$$K = \frac12\,I_{xx}\,\omega^2 \quad\text{(rotation about the x-axis)},$$
$$K = \frac12\,I_{yy}\,\omega^2 \quad\text{(rotation about the y-axis)}.$$
Products of Inertia
For rotation about more general axes, however, three other quantities
called products of inertia (also mixed moments of inertia) are needed:
$$I_{yz} = I_{zy} = -\iiint_D \delta(\mathbf{x})\,yz\,dV, \qquad I_{zx} = I_{xz} = -\iiint_D \delta(\mathbf{x})\,zx\,dV, \qquad I_{xy} = I_{yx} = -\iiint_D \delta(\mathbf{x})\,xy\,dV.$$
The three moments and three products of inertia together form the symmetric
matrix of inertia
$$I = \begin{pmatrix} I_{xx} & I_{xy} & I_{xz} \\ I_{yx} & I_{yy} & I_{yz} \\ I_{zx} & I_{zy} & I_{zz} \end{pmatrix}.$$
The quadratic form associated with this matrix gives the kinetic energy of a
solid rotating about an axis through $\mathbf{0}$.
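Suppose $D$ rotates about an axis through $\mathbf{0}$ with angular velocity $\boldsymbol{\omega} =
(\omega_x, \omega_y, \omega_z)$. Then its kinetic energy is
$$K = \frac12\,\boldsymbol{\omega} I \boldsymbol{\omega}' = \frac12\,(\omega_x, \omega_y, \omega_z)\begin{pmatrix} I_{xx} & I_{xy} & I_{xz} \\ I_{yx} & I_{yy} & I_{yz} \\ I_{zx} & I_{zy} & I_{zz} \end{pmatrix}\begin{pmatrix} \omega_x \\ \omega_y \\ \omega_z \end{pmatrix}.$$
Proof: Let $\mathbf{u}$ be a unit vector along the axis of rotation. Then $\boldsymbol{\omega} = \omega\mathbf{u}$,
where $\omega^2 = \boldsymbol{\omega}\cdot\boldsymbol{\omega} = \omega_x^2 + \omega_y^2 + \omega_z^2$. If $\mathbf{x} \in D$, the projection of $\mathbf{x}$ on the axis
of rotation is $\mathbf{x}\cdot\mathbf{u}$. Therefore the distance of $\mathbf{x}$ from the axis is $[\,|\mathbf{x}|^2 - (\mathbf{x}\cdot\mathbf{u})^2\,]^{1/2}$.
See Fig. 5.2. Hence the speed at $\mathbf{x}$ is $\omega[\,|\mathbf{x}|^2 - (\mathbf{x}\cdot\mathbf{u})^2\,]^{1/2}$, and the kinetic energy
of an elementary mass at $\mathbf{x}$ is
$$dK = \frac12\,\delta(\mathbf{x})\,\omega^2\left[\,|\mathbf{x}|^2 - (\mathbf{x}\cdot\mathbf{u})^2\,\right]dV = \frac12\,\delta(\mathbf{x})\left[\,\omega^2|\mathbf{x}|^2 - (\mathbf{x}\cdot\boldsymbol{\omega})^2\,\right]dV.$$
But
$$\omega^2|\mathbf{x}|^2 - (\mathbf{x}\cdot\boldsymbol{\omega})^2 = \boldsymbol{\omega}\left[\begin{pmatrix} |\mathbf{x}|^2 & 0 & 0 \\ 0 & |\mathbf{x}|^2 & 0 \\ 0 & 0 & |\mathbf{x}|^2 \end{pmatrix} - \begin{pmatrix} x^2 & xy & xz \\ yx & y^2 & yz \\ zx & zy & z^2 \end{pmatrix}\right]\boldsymbol{\omega}',$$
hence
$$dK = \frac12\,\delta(\mathbf{x})\,\boldsymbol{\omega}\begin{pmatrix} y^2 + z^2 & -xy & -xz \\ -yx & z^2 + x^2 & -yz \\ -zx & -zy & x^2 + y^2 \end{pmatrix}\boldsymbol{\omega}'\,dV.$$
The formula for $K$ follows by integrating over $D$.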
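The identity behind this proof can be checked on a discrete "rigid body" made of a few point masses, where the integrals become sums. The sketch below is an illustration under that assumption; the function names and the sample data are hypothetical. It compares the direct sum of $\frac12 m|\boldsymbol{\omega}\times\mathbf{x}|^2$ with the quadratic form $\frac12\boldsymbol{\omega}I\boldsymbol{\omega}'$.

```python
def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def inertia_matrix(masses, points):
    """Matrix of inertia: moments on the diagonal, products (with minus sign) off it."""
    I = [[0.0] * 3 for _ in range(3)]
    for m, p in zip(masses, points):
        r2 = sum(c * c for c in p)
        for i in range(3):
            for j in range(3):
                I[i][j] += m * ((r2 if i == j else 0.0) - p[i] * p[j])
    return I

def kinetic_energy_direct(masses, points, omega):
    """Sum of (1/2) m |v|^2, where v = omega x position for rigid rotation about 0."""
    total = 0.0
    for m, p in zip(masses, points):
        v = cross(omega, p)
        total += 0.5 * m * sum(c * c for c in v)
    return total

def kinetic_energy_quadratic(I, omega):
    """The quadratic form (1/2) omega I omega'."""
    return 0.5 * sum(omega[i] * I[i][j] * omega[j]
                     for i in range(3) for j in range(3))

masses = [1.0, 2.0, 0.5]
points = [(1.0, 0.0, 0.0), (0.0, 1.0, 2.0), (-1.0, 3.0, 1.0)]
omega = (0.3, -1.2, 2.0)
k_direct = kinetic_energy_direct(masses, points, omega)
k_quad = kinetic_energy_quadratic(inertia_matrix(masses, points), omega)
print(k_direct, k_quad)
```

The two values agree up to rounding, reflecting $|\boldsymbol{\omega}\times\mathbf{x}|^2 = \omega^2|\mathbf{x}|^2 - (\mathbf{x}\cdot\boldsymbol{\omega})^2$.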
Applications
Geometric symmetries simplify calculation of the moments and products
of inertia. For example, suppose that a solid is symmetric in the x, y-plane.
Recall this means that whenever a point $(x, y, z)$ is in the solid, then so is
$(x, y, -z)$, and $\delta(x, y, -z) = \delta(x, y, z)$. Then $I_{zx} = 0$, because the contribution
$$-\delta(x, y, z)\,zx\,dV$$
at each point $(x, y, z)$ above the x, y-plane is cancelled by the contribution
$$-\delta(x, y, -z)\,(-z)(x)\,dV = \delta(x, y, z)\,zx\,dV$$
at the symmetric point $(x, y, -z)$ below the x, y-plane. Likewise $I_{yz} = 0$
under the same symmetry condition. For example, a uniform right circular
cone with axis the z-axis has all products of inertia 0, since it is symmetric
in two coordinate planes.
EXAMPLE 5.1
Compute the moments and products of inertia for a uniform
sphere of radius a and mass M with center at the origin. Measure
length in cm and mass in gm.
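Solution: Let $\delta$ denote the constant density; then $M = 4\pi a^3\delta/3$ gm.
Because the sphere is symmetric in each coordinate plane, all products of
inertia are 0.
By symmetry, $I_{xx} = I_{yy} = I_{zz}$. It appears most natural to use spherical
coordinates and compute $I_{zz}$:
$$I_{zz} = \delta\iiint (x^2 + y^2)\,dV = \delta\iiint (\rho^2\sin^2\phi\cos^2\theta + \rho^2\sin^2\phi\sin^2\theta)\,dV = \delta\iiint (\rho^2\sin^2\phi)\,\rho^2\sin\phi\,d\rho\,d\phi\,d\theta$$
$$= \delta\int_0^{a}\rho^4\,d\rho\int_0^{\pi}\sin^3\phi\,d\phi\int_0^{2\pi}d\theta = \delta\left(\frac{a^5}{5}\right)\left(\frac43\right)(2\pi) = \frac{8\pi}{15}\,\delta a^5 = \frac25\,Ma^2.$$
Answer: $I_{xy} = I_{yz} = I_{zx} = 0$,
$I_{xx} = I_{yy} = I_{zz} = \frac25 Ma^2$ gm-cm$^2$.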
5
EXAMPLE 5.2
Find the products of inertia of the first octant portion of the
sphere of Example 5.1.
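Solution: By symmetry $I_{xy} = I_{yz} = I_{zx}$. Choose spherical coordinates
and compute that product of inertia whose formula seems the most symmetric;
this is $I_{xy}$, since the z-axis is special in spherical coordinates:
$$I_{xy} = -\delta\iiint xy\,dx\,dy\,dz = -\delta\iiint (\rho^2\sin^2\phi\cos\theta\sin\theta)\,\rho^2\sin\phi\,d\rho\,d\phi\,d\theta$$
$$= -\delta\int_0^{a}\rho^4\,d\rho\int_0^{\pi/2}\sin^3\phi\,d\phi\int_0^{\pi/2}\cos\theta\sin\theta\,d\theta = -\delta\left(\frac{a^5}{5}\right)\left(\frac23\right)\left(\frac12\right) = -\frac{\delta a^5}{15}.$$
But
$$M = \frac18\left(\frac43\pi a^3\delta\right) = \frac16\,\pi a^3\delta.$$
Hence $I_{xy} = -2Ma^2/5\pi$. (The minus sign comes from the definition of the products of inertia.)
Answer: $I_{xy} = I_{yz} = I_{zx} = -\dfrac{2}{5\pi}\,Ma^2$ gm-cm$^2$.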
Note on units: The unit of work, or energy, in the CGS system is
1 erg = 1 dyne-cm. Remember 1 dyne = 1 gm-cm/sec$^2$.
EXAMPLE 5.3
The solid of Example 5.2 rotates with angular velocity
$\boldsymbol{\omega} = (\omega/\sqrt{3})\,(1, 1, 1)$. Find its kinetic energy.
Solution: By symmetry, the moments of inertia are $\frac18$ those of the full
sphere, and the mass $M$ is $\frac18$ that of the full sphere. By Example 5.1,
$$I_{xx} = I_{yy} = I_{zz} = \frac25\,Ma^2,$$
where $M$ denotes the mass of the first octant portion of the sphere. Since
$\omega_x = \omega_y = \omega_z = \omega/\sqrt{3}$, and the products of inertia are all equal to
$I_{xy} = -2Ma^2/5\pi$, the formula for kinetic energy yields
$$K = \frac12\,(3I_{xx} + 6I_{xy})\,\frac{\omega^2}{3} = \frac12\left[\frac65\,Ma^2 - \frac{12}{5\pi}\,Ma^2\right]\frac{\omega^2}{3} = \frac15\left(1 - \frac{2}{\pi}\right)Ma^2\omega^2.$$
Answer: $K = \dfrac15\left(1 - \dfrac{2}{\pi}\right)Ma^2\omega^2$ erg.
Parallel Axis Theorem
Take an axis $\beta$ anywhere in space (not necessarily through the origin).
Suppose a rigid body $D$ rotates about this axis with angular speed $\omega$. See
Fig. 5.3. The speed at each point $\mathbf{x}$ is $\omega B_{\mathbf{x}}$, where $B_{\mathbf{x}}$ is the distance from $\mathbf{x}$
to the axis $\beta$. Hence the kinetic energy is
$$K = \frac12\,I_\beta\,\omega^2, \quad\text{where}\quad I_\beta = \iiint_D B_{\mathbf{x}}^2\,\delta(\mathbf{x})\,dV.$$
This defines $I_\beta$, the moment of inertia of $D$ about the axis $\beta$. The Parallel Axis
Theorem allows us to compute $I_\beta$ in terms of the moment of inertia about a
parallel axis through the center of gravity.
Parallel Axis Theorem If $\alpha$ is an axis through the center of gravity of
$D$, and $\beta$ is an axis parallel to $\alpha$, then
$$I_\beta = I_\alpha + Md^2,$$
where $d$ is the distance between the axes and $M$ is the mass of $D$.
The proof of this result in the general case is not hard, but involves some
technicalities. So let us content ourselves with the special case in which
$\bar{\mathbf{x}} = \mathbf{0}$ and $\alpha$ is the z-axis. Suppose $\beta$ passes through $(a, b, 0)$. See Fig. 5.4.
Fig. 5.4
For each point $\mathbf{x}$ of $D$, the distance $A_{\mathbf{x}}$ of $\mathbf{x}$ from $\alpha$ is given by
$$A_{\mathbf{x}}^2 = x^2 + y^2 = |(x, y)|^2,$$
and the distance $B_{\mathbf{x}}$ of $\mathbf{x}$ from $\beta$ is given by
$$B_{\mathbf{x}}^2 = (x - a)^2 + (y - b)^2 = |(x, y) - (a, b)|^2 = |(x, y)|^2 - 2(a, b)\cdot(x, y) + |(a, b)|^2 = A_{\mathbf{x}}^2 - 2(a, b)\cdot(x, y) + d^2.$$
Multiply by the element of mass, $\delta(\mathbf{x})\,dV$, and integrate. Result:
$$I_\beta = I_\alpha - 2(a, b)\cdot(m_x, m_y) + Md^2.$$
But $m_x = m_y = 0$ since $\bar{\mathbf{x}} = \mathbf{0}$. Hence $I_\beta = I_\alpha + Md^2$.
EXAMPLE 5.6
Find the moment of inertia of a uniform sphere of radius $a$ and mass
$M$ about an axis tangent to the sphere.
Solution: From Example 5.1, the moment of inertia about any axis
through the center is $2Ma^2/5$. The distance from a tangent axis $\beta$ to
the center is $a$, so the Parallel Axis Theorem implies
$$I_\beta = \frac25\,Ma^2 + Ma^2 = \frac75\,Ma^2.$$
Answer: $\frac75\,Ma^2$.
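The Parallel Axis Theorem can also be tested on a finite system of point masses, where the moment of inertia about a vertical axis is a plain sum. This is a sketch under that simplifying assumption; the function names, the masses, and the location of the axis $\beta$ are made up for the illustration.

```python
def center_of_gravity_xy(masses, points):
    """x- and y-coordinates of the center of gravity of a system of point masses."""
    M = sum(masses)
    return (sum(m * p[0] for m, p in zip(masses, points)) / M,
            sum(m * p[1] for m, p in zip(masses, points)) / M)

def moment_about_vertical_axis(masses, points, axis_xy):
    """Moment of inertia about the line parallel to the z-axis through axis_xy."""
    ax, ay = axis_xy
    return sum(m * ((x - ax) ** 2 + (y - ay) ** 2)
               for m, (x, y, z) in zip(masses, points))

masses = [1.0, 2.0, 3.0, 0.5]
points = [(0.0, 0.0, 1.0), (2.0, 1.0, 0.0), (-1.0, 3.0, 2.0), (4.0, -2.0, 5.0)]

cg = center_of_gravity_xy(masses, points)      # axis alpha passes through here
beta = (3.0, -1.0)                             # hypothetical parallel axis
d2 = (beta[0] - cg[0]) ** 2 + (beta[1] - cg[1]) ** 2

I_alpha = moment_about_vertical_axis(masses, points, cg)
I_beta = moment_about_vertical_axis(masses, points, beta)
print(I_beta, I_alpha + sum(masses) * d2)
```

The cross term drops out exactly because the first moments about the center of gravity vanish, which is the step used in the proof.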
EXERCISES
1. A uniform cylinder of mass $M$ gm occupies the region $-h \le z \le h$, $x^2 + y^2 \le a^2$,
distance in cm. Find its moments and products of inertia. [Hint: Use cylindrical
coordinates.]
2. (cont.) The cylinder rotates with angular speed $\omega$ rad/sec about an axis through $\mathbf{0}$
and (1, 1, 1). Find its kinetic energy.
3. Find the moments and products of inertia of the uniform hemisphere p < a,
z > 0 of mass M.
4. The axis of a uniform right circular cone of mass $M$ is the z-axis; its apex is at $\mathbf{0}$
and its base of radius $a$ is at $z = h > 0$. Find its moments and products of inertia.
5. Find the moments and products of inertia of the uniform rectangular solid of
mass $M$ bounded by the planes $x = \pm a$, $y = \pm b$, $z = \pm c$.
6. Find the moments and products of inertia of the uniform rectangular solid of
mass M bounded by the coordinate planes and the planes x = a, y = b, z = c.
7. The circular disk $y = 0$, $(x - A)^2 + z^2 \le a^2$, $0 < a < A$ is revolved about
the z-axis. Suppose the resulting anchor ring (solid torus) is a uniform solid of
mass $M$. Find $I_{zz}$.
8. In Example 5.1, use symmetry to prove, without integrating, that
$I_{zz} = \frac23\,\delta\iiint \rho^2\,dV$. Then evaluate the integral.
9. Find I zz for the uniform solid paraboloid of revolution of mass M bounded by
az = x2 + y2 and z = h.
10*. Find the moments of inertia of the uniform solid ellipsoid of mass $M$,
$$\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} \le 1.$$
[Hint: Stretch a suitable amount in each direction until the solid becomes a
sphere; set $x = ua$, $y = vb$, and $z = wc$.]
11*. Find $I_{zz}$ for the uniform solid elliptic paraboloid of mass $M$ bounded by
$$z = \frac{x^2}{a} + \frac{y^2}{b}, \qquad z = h.$$
[Hint: Use the result of Ex. 9 and the hint of Ex. 10.]
12*. Find $I_{zz}$ for the uniform solid of mass $M$ bounded by the hyperboloid of revolution
$$\frac{z^2}{c^2} = \frac{1}{a^2}\,(x^2 + y^2) - 1$$
and the planes $z = \pm h$.
cl or
13. Find the moments of inertia of the uniform solid of mass $M$ in the region
$-a \le x, y, z \le a$, $x^2 + y^2 + z^2 \ge a^2$.
14. Find the moment of inertia of a uniform solid right circular cylinder about a
generator of its lateral surface. (This is important in problems concerning rolling.)
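15*. Suppose a rigid body $R$ is rotating about an axis through $\mathbf{0}$ with angular velocity
$\boldsymbol{\omega}$. Show that the angular momentum, $\mathbf{J} = \iiint [\delta(\mathbf{x})\,\mathbf{x}\times\mathbf{v}]\,dV$ ($\mathbf{v}$ is velocity),
is given by
$$\mathbf{J} = (\omega_x, \omega_y, \omega_z)\begin{pmatrix} I_{xx} & I_{xy} & I_{xz} \\ I_{yx} & I_{yy} & I_{yz} \\ I_{zx} & I_{zy} & I_{zz} \end{pmatrix}.$$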
The definitions of moments and products of inertia can be easily modified to apply to
wires and laminas rather than solids. Such moments can be computed directly, or
sometimes by limit arguments.
16. Find the moments of inertia of the uniform circular disk $x^2 + y^2 \le a^2$, $z = 0$ of
mass $M$. (The units are gm and cm.) [Hint: Let $h \to 0$ in Ex. 1.]
17. A uniform wire of mass $M$ lies along the z-axis from $z = -h$ to $z = h$. Find its
moments of inertia. (The units are gm and cm.) [Hint: Let $a \to 0$ in Ex. 1.]
18. A uniform rod of length $L$ and mass $M$ lies along the positive x-axis on the interval
$a \le x \le a + L$. Find $I_{yy}$.
19. Find the moments of inertia of a uniform spherical shell (surface) of radius a
and mass M about an axis through its center.
20. A uniform circular wire hoop x2 + y2 = a2, z = 0 has mass M. Find its moments
of inertia.
21. Find I zz for the toroidal shell, the surface of the solid torus in Ex. 7.
22. Find the moments of inertia of the uniform cylindrical shell $x^2 + y^2 = a^2$,
$-h \le z \le h$.
12. Integration Theory
1. INTRODUCTION
In this chapter we shall give the theoretical background for the techniques
of multiple integration developed in the last two chapters. We shall concentrate
mainly on the double integral, both because it is easier to visualize the plane
than space, and because no really new ideas are involved in doing the three-
dimensional theory once we master the two-dimensional case.
First we must give an accurate definition of the integral, one broad enough
to apply at least to continuous functions. The idea will be to define the integral
for step functions in an obvious way, then to obtain the integral of a more
general function by squeezing the function between step functions.
Let us review briefly how we do this on the line. We work on a fixed closed
interval $I = [a, b]$. A partition of $I$ consists of a finite increasing sequence of
division points (not necessarily equally spaced):
$$a = x_0 < x_1 < \cdots < x_n = b.$$
A step function is a function $s(x)$ on $[a, b]$ that is constant on the open
intervals of a partition (Fig. 1.1). Thus
$$s(x) = B_i \quad\text{for}\quad x_{i-1} < x < x_i.$$
Fig. 1.1
The integral of $s(x)$ is defined by
$$\int_I s(x)\,dx = \int_a^b s(x)\,dx = \sum_{i=1}^{n} B_i\,(x_i - x_{i-1}).$$
Now let $f(x)$ be any function with domain $I$, arbitrary except we insist
that $f$ is bounded, $|f(x)| \le M$. We call $f$ integrable if for each $\varepsilon > 0$ there
exist step functions $s$ and $S$ on $I$ such that
(1) $s(x) \le f(x) \le S(x)$ for all $x \in I$,
(2) $\displaystyle\int_I S(x)\,dx - \int_I s(x)\,dx < \varepsilon$.
Next we show that if $f$ is integrable, then there exists a unique number, called
$$\int_I f(x)\,dx,$$
such that
$$\int_I s(x)\,dx \le \int_I f(x)\,dx \le \int_I S(x)\,dx$$
for all step functions $s(x)$ and $S(x)$ that satisfy (1).
We prove that a continuous function is integrable, using its known uniform
continuity to approximate it by step functions.
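The squeezing idea can be seen concretely for an increasing function, where the lower and upper step functions on each open interval of a partition can be taken to be the values at the left and right endpoints. The sketch below makes several assumptions for illustration: the function $f(x) = x^2$ on $[0, 1]$, a uniform partition, and the helper names `step_integral` and `squeeze`.

```python
def step_integral(values, xs):
    """Integral of a step function: sum of B_i * (x_i - x_{i-1})."""
    return sum(B * (xs[i + 1] - xs[i]) for i, B in enumerate(values))

def squeeze(f, a, b, n):
    """Integrals of a lower and an upper step function for an increasing f on [a, b]."""
    xs = [a + (b - a) * i / n for i in range(n + 1)]
    lower = [f(xs[i]) for i in range(n)]       # value on (x_i, x_{i+1}) of s
    upper = [f(xs[i + 1]) for i in range(n)]   # value on (x_i, x_{i+1}) of S
    return step_integral(lower, xs), step_integral(upper, xs)

lower_int, upper_int = squeeze(lambda x: x * x, 0.0, 1.0, 1000)
print(lower_int, upper_int, upper_int - lower_int)
```

For this example the gap between the two integrals is $(f(1) - f(0))/n = 1/n$, so any $\varepsilon > 0$ can be beaten by taking $n$ large enough, and the value $1/3$ is trapped between the two numbers.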
This then is the way integrals are developed on the line; we shall use the
same approach in the plane. It is by no means the only way to achieve a theory
of integration; we like it because it goes quickly, by easy natural steps. Also,
the definitions, theorems, and proofs, with only minor modifications, cover
the one-variable case. Therefore you do not have to know the theory of
integration on the line to read this chapter.
2. STEP FUNCTIONS
We fix a closed rectangle $\mathbf{I}$ in the plane,
$$\mathbf{I} = \{(x, y) \mid a \le x \le b,\ c \le y \le d\},$$
and denote its area by $|\mathbf{I}|$. Thus
$$|\mathbf{I}| = (b - a)(d - c).$$
A partition of $\mathbf{I}$ is a decomposition of $\mathbf{I}$ into a finite number of sub-rectangles by
lines parallel to the axes. Thus we have partitions of $[a, b]$ and $[c, d]$,
$$a = x_0 < x_1 < x_2 < \cdots < x_m = b,$$
$$c = y_0 < y_1 < y_2 < \cdots < y_n = d,$$
defining the partition of $\mathbf{I}$. See Fig. 2.1. We denote the individual rectangles
of the partition by
$$\mathbf{I}_{jk} = \{(x, y) \mid x_{j-1} \le x \le x_j,\ y_{k-1} \le y \le y_k\},$$
$$j = 1, 2, \ldots, m, \qquad k = 1, 2, \ldots, n.$$
Fig. 2.1 a partition of I
Step Function A step function on $\mathbf{I}$ is a real-valued function $s(x, y)$
with domain $\mathbf{I}$ such that
(1) $s$ is bounded;
(2) there is a partition of $\mathbf{I}$ such that $s$ is constant on the open interior
of each rectangle $\mathbf{I}_{jk}$ of the partition. That is, $s(x, y) = B_{jk}$ for
$x_{j-1} < x < x_j$ and $y_{k-1} < y < y_k$.
Note that $s$ is unrestricted on the division lines of the partition (except that
$s$ must be bounded).
If $s(x, y)$ is a step function relative to some partition, then clearly it is a
step function relative to any finer partition, a partition with more division
lines (Fig. 2.2).
Fig. 2.2 (a) partition; (b) finer partition (refinement)
We want to handle sums, products, and other combinations of step functions.
But two different step functions may be associated with different
partitions, so what do we do? We superpose the partitions (Fig. 2.3); then
both functions are step functions relative to the new finer partition.
Fig. 2.3 (a) partition; (b) second partition; (c) superposed (common refinement)
We can use this construction to prove routinely the following result (see
Exs. 1-4).
Theorem 2.1 Let $s_1$ and $s_2$ be step functions on $\mathbf{I}$ and $k$ a constant. Then
each of the following is a step function:
$$s_1 + s_2, \qquad ks_1, \qquad |s_1|, \qquad s_1s_2,$$
$$S(x, y) = \max\{s_1(x, y),\ s_2(x, y)\},$$
$$s(x, y) = \min\{s_1(x, y),\ s_2(x, y)\}.$$
We are now ready to define the integral of a step function.
Integrals of Step Functions
Integral of a Step Function Let $s$ be a step function on $\mathbf{I}$ relative to a
partition of $\mathbf{I}$ into rectangles $\mathbf{I}_{jk}$. Suppose $s(x, y) = B_{jk}$ for $(x, y)$ in the
interior of $\mathbf{I}_{jk}$. Define
$$\iint_{\mathbf{I}} s\,dx\,dy = \sum_{j,k} B_{jk}\,|\mathbf{I}_{jk}|.$$
(Recall that $|\mathbf{I}_{jk}|$ denotes the area of $\mathbf{I}_{jk}$.)
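The defining sum is simple enough to compute by hand or by machine. The sketch below evaluates it for a small step function and then re-evaluates the same function on a finer partition, as a check that the value does not depend on the partition used. The data, the helper names, and the way the finer partition is chosen are all illustrative assumptions.

```python
def step_integral_2d(B, xs, ys):
    """Sum of B[j][k] * |I_jk| over the rectangles of the partition."""
    return sum(B[j][k] * (xs[j + 1] - xs[j]) * (ys[k + 1] - ys[k])
               for j in range(len(xs) - 1) for k in range(len(ys) - 1))

def refine(B, xs, ys, new_xs, new_ys):
    """Values of the same step function on a finer partition.

    new_xs and new_ys must contain all the division points of xs and ys.
    """
    def coarse_index(cuts, lo, hi):
        mid = (lo + hi) / 2
        # Locate the coarse interval that contains the fine interval's midpoint.
        return next(i for i in range(len(cuts) - 1) if cuts[i] < mid < cuts[i + 1])
    return [[B[coarse_index(xs, new_xs[j], new_xs[j + 1])]
              [coarse_index(ys, new_ys[k], new_ys[k + 1])]
             for k in range(len(new_ys) - 1)]
            for j in range(len(new_xs) - 1)]

xs, ys = [0.0, 1.0, 3.0], [0.0, 2.0, 3.0]
B = [[1.0, 2.0], [3.0, -1.0]]                 # B[j][k] is the value on I_jk
coarse = step_integral_2d(B, xs, ys)

new_xs, new_ys = [0.0, 0.5, 1.0, 2.0, 3.0], [0.0, 1.0, 2.0, 3.0]
fine = step_integral_2d(refine(B, xs, ys, new_xs, new_ys), new_xs, new_ys)
print(coarse, fine)
```

Each coarse rectangle is cut into smaller ones whose areas add up to the original area, so both sums give the same number.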
This definition looks innocent, yet it contains two subtle points. The first is
that the same step function s(x, y) may be associated with two different
partitions, Pi and P2. Do we get the same answer when we compute the inte
gral relative to Pi and relative to P2? Let us show that we do.
Suppose that $J_j$ is a sub-rectangle of $P_1$ and that s(x, y) takes the constant
value $B_j$ on $J_j$. If $P_3$ is the superposed partition (common refinement) of $P_1$
and $P_2$, then $J_j$ decomposes into a number of sub-rectangles $J_{ji}$ of $P_3$, and their
areas add up to the area of $J_j$. On each, s(x, y) has the same constant value $B_j$.
Therefore,
$$\iint_{(P_1)} s(x, y)\, dx\, dy = \sum_j B_j\, |J_j| = \sum_j B_j \sum_i |J_{ji}| = \sum_{j,i} B_j\, |J_{ji}| = \iint_{(P_3)} s(x, y)\, dx\, dy.$$
By the same reasoning,
$$\iint_{(P_2)} s(x, y)\, dx\, dy = \iint_{(P_3)} s(x, y)\, dx\, dy.$$
Hence,
$$\iint_{(P_1)} s(x, y)\, dx\, dy = \iint_{(P_2)} s(x, y)\, dx\, dy.$$
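The independence argument can be checked on a small case: re-express a step function on a finer partition and confirm that the defining sum does not change. This is an illustrative sketch only; `step_integral` and `refine` are hypothetical helper names, and the extra division points are assumed to lie inside the rectangle.

```python
import bisect


def step_integral(xs, ys, B):
    """Sum of B[j][k] * |I_jk| over the partition given by xs and ys."""
    return sum(
        B[j][k] * (xs[j + 1] - xs[j]) * (ys[k + 1] - ys[k])
        for j in range(len(xs) - 1)
        for k in range(len(ys) - 1)
    )


def refine(xs, ys, B, new_xs, new_ys):
    """Express the same step function on a finer partition.

    new_xs, new_ys are extra division points inside the rectangle.
    """
    fx = sorted(set(xs) | set(new_xs))
    fy = sorted(set(ys) | set(new_ys))
    fB = []
    for j in range(len(fx) - 1):
        jj = bisect.bisect_right(xs, fx[j]) - 1      # coarse x-interval containing the fine one
        row = []
        for k in range(len(fy) - 1):
            kk = bisect.bisect_right(ys, fy[k]) - 1  # coarse y-interval containing the fine one
            row.append(B[jj][kk])                    # same constant value B_j on every piece
        fB.append(row)
    return fx, fy, fB


xs, ys, B = [0.0, 1.0, 3.0], [0.0, 2.0], [[5.0], [-1.0]]
coarse = step_integral(xs, ys, B)
fine = step_integral(*refine(xs, ys, B, [0.5, 2.0], [0.5]))
```

Here `coarse` and `fine` agree, because each coarse rectangle is split into pieces whose areas add up to the original area.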
The second subtle point is this. Is there anything to be gained by using
more general partitions of I into sub-rectangles than those arising from
partitions of [a, b] and [c, d]? The answer is no, as Fig. 2.4 indicates.
Now we state the main elementary properties of the integral of a step
function. Except for the last two, their proofs are easy and are left as exercises.
Theorem 2.2 Let $s_1$, $s_2$, s be step functions on I. Then:
(1) $\iint_I (s_1 + s_2)\, dx\, dy = \iint_I s_1\, dx\, dy + \iint_I s_2\, dx\, dy.$
(2) $\iint_I k s\, dx\, dy = k \iint_I s\, dx\, dy$ (k any constant).
(3) If $s \ge 0$ on I, then $\iint_I s\, dx\, dy \ge 0$.
(4) If $s_1 \le s_2$ on I, then $\iint_I s_1\, dx\, dy \le \iint_I s_2\, dx\, dy$.
(5) If I is partitioned into sub-rectangles $I_{jk}$, then
$$\iint_I s\, dx\, dy = \sum_{j,k} \iint_{I_{jk}} s\, dx\, dy.$$
(6) $\max\left\{ \iint_I s_1\, dx\, dy,\ \iint_I s_2\, dx\, dy \right\} \le \iint_I \max\{s_1, s_2\}\, dx\, dy.$
(7) $\iint_I \min\{s_1, s_2\}\, dx\, dy \le \min\left\{ \iint_I s_1\, dx\, dy,\ \iint_I s_2\, dx\, dy \right\}.$
Fig. 2.4 (a) more general partition; (b) the segments extended
To prove (6), note that $s_1 \le \max\{s_1, s_2\}$ on I, hence $\iint s_1 \le \iint \max\{s_1, s_2\}$ by (4). Likewise $\iint s_2 \le \iint \max\{s_1, s_2\}$; consequently
$$\max\left\{ \iint s_1,\ \iint s_2 \right\} \le \iint \max\{s_1, s_2\}.$$
The proof of (7) is similar.

Remark: Note the meaning of (5). The given partition of I is not assumed to be related to a partition attached to the given step function.

Iteration

We complete the story of step functions with the iteration formula.

Theorem 2.3 Let s(x, y) be a step function on I, attached to the partition
$$a = x_0 < x_1 < \cdots < x_m = b, \qquad c = y_0 < y_1 < \cdots < y_n = d.$$
Define t(x) by
$$t(x) = \begin{cases} \displaystyle\int_c^d s(x, y)\, dy & \text{for } x \ne x_j, \\ \text{arbitrary} & \text{for } x = x_j. \end{cases}$$
Then t(x) is a step function on [a, b] and
$$\iint_I s\, dx\, dy = \int_a^b t\, dx = \int_a^b \left( \int_c^d s(x, y)\, dy \right) dx.$$

Proof: As usual, let
$$I_{jk} = \{(x, y) \mid x_{j-1} \le x \le x_j,\ y_{k-1} \le y \le y_k\}$$
be the sub-rectangles of the partition and let $s(x, y) = B_{jk}$ for (x, y) in the open interior of $I_{jk}$. For $x_{j-1} < x < x_j$ we have
$$t(x) = \int_c^d s(x, y)\, dy = \sum_{k=1}^n B_{jk}\,(y_k - y_{k-1}).$$
This proves that t(x) is constant on the open interval $(x_{j-1}, x_j)$, hence is a step function on [a, b]. By definition,
$$\int_a^b t(x)\, dx = \sum_{j=1}^m \left( \sum_{k=1}^n B_{jk}\,(y_k - y_{k-1}) \right)(x_j - x_{j-1}) = \sum_{j,k} B_{jk}\, |I_{jk}| = \iint_I s\, dx\, dy.$$

EXERCISES

Prove in detail Theorem 2.1 for
1. $s_1 + s_2$
2. S(x, y)
3. $|s_1|$
4. $s_1 s_2$

Prove Theorem 2.2:
5. part (1)
6. part (2)
7. part (3)
8. part (4)
9. part (5).

3. THE RIEMANN INTEGRAL

Let f be a bounded real-valued function with domain I. We shall try to squeeze f between step functions whose integrals are arbitrarily close to each other. This is not possible for every function, but when it is, we are able to define the integral of f.

Since f is bounded, there exist step functions s and S such that $s \le f \le S$ on I. Indeed, if $|f| \le B$ on I, take for instance the constant functions $s = -B$ and $S = B$.

Integrable Function A bounded function f on I is called integrable if for each $\varepsilon > 0$, there exist step functions s and S on I such that
(1) $s \le f \le S$ on I,
(2) $\iint_I S\, dx\, dy - \iint_I s\, dx\, dy < \varepsilon$.
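To see the squeezing definition at work, one can build the two step functions explicitly for a simple function. The sketch below is an illustration under stated assumptions: it takes I to be the unit square and assumes f is increasing in each variable, so that the infimum and supremum of f on each sub-rectangle sit at opposite corners. The name `squeeze` is hypothetical.

```python
def squeeze(f, n):
    """Integrals of step functions s <= f <= S on the n-by-n grid of [0,1] x [0,1].

    Assumes f is increasing in x and in y, so on each cell the infimum is
    taken at the lower-left corner and the supremum at the upper-right corner.
    """
    h = 1.0 / n
    lower = 0.0
    upper = 0.0
    for j in range(n):
        for k in range(n):
            lower += f(j * h, k * h) * h * h              # value of s on the cell times its area
            upper += f((j + 1) * h, (k + 1) * h) * h * h  # value of S on the cell times its area
    return lower, upper


f = lambda x, y: x * y
lower, upper = squeeze(f, 100)
gap = upper - lower   # for f = xy the gap works out to 1/n
```

For f(x, y) = xy the two integrals bracket 1/4 and differ by 1/n, so condition (2) holds as soon as n exceeds $1/\varepsilon$.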
The first thing we must do is show that if f is an integrable function, we can assign to f a unique number, its integral.

Theorem 3.1 Let f be integrable on I. Consider all step functions s and S such that $s \le f \le S$ on I. Then
$$\sup \iint_I s\, dx\, dy = \inf \iint_I S\, dx\, dy.$$
The common value of the sup and inf is called the integral of f on I, and is written
$$\iint_I f\, dx\, dy.$$

Proof: If $s \le f \le S$ on I, then
$$\iint_I s\, dx\, dy \le \iint_I S\, dx\, dy$$
by Theorem 2.2, part (4). Hold S fixed and let s vary. It follows that
$$\sup \iint_I s\, dx\, dy \le \iint_I S\, dx\, dy.$$
The left side of this inequality is independent of S. Now let S vary; it follows that
(1) $\sup \iint_I s\, dx\, dy \le \inf \iint_I S\, dx\, dy.$
This inequality holds for any bounded function. Next we show that if f is an integrable function, then the reverse inequality also holds, so (1) becomes an equality.

Let f be integrable. Given $\varepsilon > 0$, there exist step functions $s_0$ and $S_0$ such that $s_0 \le f \le S_0$ and
$$\iint_I S_0\, dx\, dy < \varepsilon + \iint_I s_0\, dx\, dy.$$
Therefore
$$\inf \iint_I S\, dx\, dy \le \iint_I S_0\, dx\, dy < \varepsilon + \iint_I s_0\, dx\, dy \le \varepsilon + \sup \iint_I s\, dx\, dy.$$
This is true for all $\varepsilon > 0$, hence
(2) $\inf \iint_I S\, dx\, dy \le \sup \iint_I s\, dx\, dy.$
By combining (1) and (2) we have the desired equality.

Remark 1: The theorem implies that if f is integrable and $s \le f \le S$, then
$$\iint s \le \iint f \le \iint S.$$
This relation determines $\iint f$. In fact, $\iint f$ is the only number between all pairs $\iint s$ and $\iint S$.

Remark 2: The integral we have defined is called the Riemann (double) integral, and integrable functions are also called R-integrable. It is possible to define integrals that apply to more functions (the most important is the Lebesgue integral), but their theories are more advanced.

Theorem 3.2 Let f be a step function on I. Then f is integrable and the R-integral of f is the integral defined in Section 2. In particular, if f = B is constant on I, then $\iint_I f = B\,|I|$.

Proof: Given a step function f and $\varepsilon > 0$, take S = s = f in the definition of integrable function.
Then $s \le f \le S$ and $\iint S - \iint s = 0 < \varepsilon$, etc.

Continuous Functions

It is an important practical matter that all continuous functions are integrable. This is certainly not obvious from the definition.

Theorem 3.3 Each continuous function on I is integrable.

Proof: Let f be continuous on I. By the theorems in Chapter 6, Section 3, f is bounded and is uniformly continuous on I, since I is a bounded and closed set. Because f is bounded, we are allowed to ask whether or not it is integrable.

Thus let $\varepsilon > 0$. Since f is uniformly continuous, there is $\delta > 0$ such that
$$|f(\mathbf{x}) - f(\mathbf{z})| < \varepsilon / |I| \quad \text{whenever } |\mathbf{x} - \mathbf{z}| < \delta.$$
Choose a partition with sub-rectangles $I_{jk}$ so small that $|\mathbf{x} - \mathbf{z}| < \delta$ whenever $\mathbf{x}$ and $\mathbf{z}$ lie in the same $I_{jk}$. Set
$$m_{jk} = \inf\{ f(\mathbf{x}) \mid \mathbf{x} \in I_{jk} \}, \qquad M_{jk} = \sup\{ f(\mathbf{x}) \mid \mathbf{x} \in I_{jk} \}.$$
Then $m_{jk} \le f(\mathbf{x}) \le M_{jk}$ for all $\mathbf{x} \in I_{jk}$, and $M_{jk} - m_{jk} \le \varepsilon / |I|$.

Now define step functions s and S by requiring that $s(\mathbf{x}) = m_{jk}$ and $S(\mathbf{x}) = M_{jk}$ for $\mathbf{x}$ in the open interior of $I_{jk}$, and $s(\mathbf{x}) = S(\mathbf{x}) = f(\mathbf{x})$ for all points $\mathbf{x}$ on the division lines of the partition. Then $s \le f \le S$ on I and
$$\iint_I S\, dx\, dy - \iint_I s\, dx\, dy = \iint_I (S - s)\, dx\, dy \le \iint_I \frac{\varepsilon}{|I|}\, dx\, dy = \frac{\varepsilon}{|I|}\, |I| = \varepsilon.$$
Therefore f satisfies the definition of an integrable function.

Sums and Multiples

Theorem 3.4 Suppose $f_1$ and $f_2$ are integrable on I and $c_1$ and $c_2$ are constants. Then $c_1 f_1 + c_2 f_2$ is integrable and
$$\iint_I (c_1 f_1 + c_2 f_2)\, dx\, dy = c_1 \iint_I f_1\, dx\, dy + c_2 \iint_I f_2\, dx\, dy.$$

Proof: We divide the proof into several steps.

(1) If f is integrable, then $-f$ is integrable and $\iint (-f) = -\iint f$. For let $\varepsilon > 0$. Choose s and S so $s \le f \le S$ and $\iint S - \iint s < \varepsilon$. Then $-S \le -f \le -s$ and
$$\iint (-s) - \iint (-S) = \iint S - \iint s < \varepsilon, \quad \text{etc.}$$

(2) If f is integrable and c > 0, then cf is integrable and $\iint (cf) = c \iint f$. This time choose s and S so $s \le f \le S$ and $\iint S - \iint s < \varepsilon / c$. Then $cs \le cf \le cS$ and
$$\iint cS - \iint cs = c \left( \iint S - \iint s \right) < \varepsilon, \quad \text{etc.}$$

(3) If f is integrable and c is a constant, then cf is integrable and $\iint cf = c \iint f$. We have just disposed of c > 0.
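If c < 0, then $cf = (-c)(-f)$ does it in two steps. The case c = 0 is obvious.

(4) If $f_1$ and $f_2$ are integrable, then $f_1 + f_2$ is integrable and
$$\iint (f_1 + f_2) = \iint f_1 + \iint f_2.$$
Given $\varepsilon > 0$, choose step functions satisfying $s_1 \le f_1 \le S_1$, $s_2 \le f_2 \le S_2$, $\iint S_1 - \iint s_1 < \tfrac12 \varepsilon$, and $\iint S_2 - \iint s_2 < \tfrac12 \varepsilon$.
Then $s_1 + s_2 \le f_1 + f_2 \le S_1 + S_2$ and
$$\iint (S_1 + S_2) - \iint (s_1 + s_2) = \left( \iint S_1 - \iint s_1 \right) + \left( \iint S_2 - \iint s_2 \right) < \tfrac12 \varepsilon + \tfrac12 \varepsilon = \varepsilon,$$
etc. Now given integrable functions $f_1$ and $f_2$, it follows from (3) that $c_1 f_1$ and
$c_2 f_2$ are integrable, and from (4) that $c_1 f_1 + c_2 f_2$ is integrable. Again using
(3) and (4) we have
$$\iint (c_1 f_1 + c_2 f_2) = c_1 \iint f_1 + c_2 \iint f_2.$$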
Inequalities
Theorem 3.5 Suppose f and g are integrable on I.
(1) If $f \ge 0$ on I, then $\iint_I f\, dx\, dy \ge 0$.
(2) If $f \le g$ on I, then $\iint_I f\, dx\, dy \le \iint_I g\, dx\, dy$.
(3) If $m \le f \le M$ on I, then $m\,|I| \le \iint_I f\, dx\, dy \le M\,|I|$.
Proof: For (1), simply observe that $s_0 = 0$ is a step function and $s_0 \le f$
on I. By definition, $\iint s \le \iint f$ for any step function s satisfying $s \le f$, in
particular for $s_0$. Now (2) follows by applying (1) to $g - f$. Finally, (3) is
immediate from the choice of s = m and S = M on I.
For the next theorem we need a simple fact, which we state as a lemma.
Lemma Let $A_1, A_2, B_1, B_2$ be four real numbers. Then
$$\max\{B_1, B_2\} - \max\{A_1, A_2\} \le \max\{B_1 - A_1,\ B_2 - A_2\}.$$
If in addition $B_1 \ge A_1$ and $B_2 \ge A_2$, then
$$\max\{B_1, B_2\} - \max\{A_1, A_2\} \le (B_1 - A_1) + (B_2 - A_2).$$
Proof: For i = 1, 2,
$$B_i = A_i + (B_i - A_i) \le \max\{A_1, A_2\} + \max\{B_1 - A_1,\ B_2 - A_2\}.$$
Hence $\max\{B_1, B_2\} \le$ (right-hand side), and the first inequality follows.
If $B_1 \ge A_1$ and $B_2 \ge A_2$, then $B_1 - A_1 \ge 0$, so $B_2 - A_2 \le (B_1 - A_1) + (B_2 - A_2)$.
Likewise $B_1 - A_1 \le (B_1 - A_1) + (B_2 - A_2)$, so
$$\max\{B_1 - A_1,\ B_2 - A_2\} \le (B_1 - A_1) + (B_2 - A_2).$$
The second inequality now follows from the first.
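A quick randomized check of the two inequalities in the lemma can catch transcription slips in the signs. The sketch below is not a proof and not part of the original text; the function name `check_lemma`, the sampling range, and the tolerance are assumptions chosen for illustration.

```python
import random


def check_lemma(trials=10000, seed=0):
    """Test both inequalities of the lemma on random real inputs."""
    rng = random.Random(seed)
    tol = 1e-12  # slack for floating-point rounding
    for _ in range(trials):
        A1, A2, B1, B2 = (rng.uniform(-10.0, 10.0) for _ in range(4))
        lhs = max(B1, B2) - max(A1, A2)
        # First inequality: holds for all real numbers.
        if lhs > max(B1 - A1, B2 - A2) + tol:
            return False
        # Second inequality: only claimed when B1 >= A1 and B2 >= A2.
        if B1 >= A1 and B2 >= A2 and lhs > (B1 - A1) + (B2 - A2) + tol:
            return False
    return True


ok = check_lemma()

# Without the extra hypothesis the second inequality can fail:
A1, A2, B1, B2 = 0.0, 0.0, 5.0, -10.0
fails = max(B1, B2) - max(A1, A2) > (B1 - A1) + (B2 - A2)   # 5 > -5
```

The last two lines show why the hypothesis $B_i \ge A_i$ is needed for the second inequality.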
Theorem 3.6 Suppose $f_1$ and $f_2$ are integrable on I. Then $\max\{f_1, f_2\}$ and
$\min\{f_1, f_2\}$ are integrable and
(1) $\max\left\{ \iint_I f_1\, dx\, dy,\ \iint_I f_2\, dx\, dy \right\} \le \iint_I \max\{f_1, f_2\}\, dx\, dy,$
(2) $\iint_I \min\{f_1, f_2\}\, dx\, dy \le \min\left\{ \iint_I f_1\, dx\, dy,\ \iint_I f_2\, dx\, dy \right\}.$
Proof: Let $\varepsilon > 0$ and choose step functions satisfying $s_1 \le f_1 \le S_1$,
$s_2 \le f_2 \le S_2$, $\iint S_1 - \iint s_1 < \tfrac12 \varepsilon$, and $\iint S_2 - \iint s_2 < \tfrac12 \varepsilon$. Set
$S = \max\{S_1, S_2\}$ and $s = \max\{s_1, s_2\}$, both step functions. Clearly
$s \le \max\{f_1, f_2\} \le S$.
We shall prove first that $\iint S - \iint s < \varepsilon$. This will imply that $\max\{f_1, f_2\}$
is integrable. By the lemma,
$$S - s \le (S_1 - s_1) + (S_2 - s_2).$$
By (4) of Theorem 2.2,
$$\iint (S - s) \le \iint \left[ (S_1 - s_1) + (S_2 - s_2) \right] < \tfrac12 \varepsilon + \tfrac12 \varepsilon = \varepsilon.$$
Hence $F = \max\{f_1, f_2\}$ is integrable. Since $f_1 \le F$ and $f_2 \le F$, we have
$\iint f_1 \le \iint F$ and $\iint f_2 \le \iint F$ by (2) of Theorem 3.5. Relation (1) follows
immediately.
Relation (2) is equivalent to (1) by the equation
$$\min\{f_1, f_2\} = -\max\{-f_1, -f_2\}.$$
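Theorem 3.7 Suppose f is integrable on I. Then |f| is integrable and
$$\left| \iint_I f\, dx\, dy \right| \le \iint_I |f|\, dx\, dy.$$
Proof: To prove |f| is integrable, we use the relation (easily verified)
$$|f| = \max\{f, 0\} - \min\{f, 0\}$$
and the last theorem. Since $f \le |f|$ and $-f \le |f|$, by Theorem 3.5 we have
$\iint f \le \iint |f|$ and $-\iint f \le \iint |f|$. Hence $\left| \iint f \right| \le \iint |f|$.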
Remark: Why not prove this theorem directly, starting with $s \le f \le S$,
etc.? Because even though we find easily that $|f| \le \max\{|s|, |S|\}$, it is hard to
find a step function below |f|. Certainly $\min\{|s|, |S|\} \le |f|$ is wrong!
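The warning in the remark can be made concrete with constant functions. The values below are a hypothetical example chosen for illustration; the last lines show one step function that does lie below |f|, namely the one that comes out of the proof via $|f| = \max\{f, 0\} - \min\{f, 0\}$.

```python
# Constant functions on I with s <= f <= S (values chosen for illustration).
s, f, S = -1.0, 0.0, 1.0
assert s <= f <= S

upper_ok = abs(f) <= max(abs(s), abs(S))   # the easy bound |f| <= max{|s|, |S|}
lower_ok = min(abs(s), abs(S)) <= abs(f)   # the tempting bound; here 1 <= 0 is false

# A valid lower bound: max{f, 0} >= max{s, 0} and -min{f, 0} >= -min{S, 0},
# so max{s, 0} - min{S, 0} <= |f|.
valid_lower = max(s, 0.0) - min(S, 0.0)
valid_ok = valid_lower <= abs(f)
```

In this example `upper_ok` and `valid_ok` are true while `lower_ok` is false, which is the point of the remark.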
Additivity of the Integral
Theorem 3.8 Suppose f is a bounded function on I, and suppose I is partitioned
into sub-rectangles $I_{jk}$. Then f is integrable on I if and only if f is
integrable on all $I_{jk}$. If so, then
$$\iint_I f\, dx\, dy = \sum_{j,k} \iint_{I_{jk}} f\, dx\, dy.$$
Proof: The theorem is true when f is a step function, by Theorem
2.2(5). We shall use this fact.
First suppose f is integrable on each of the mn sub-rectangles $I_{jk}$. Let
$\varepsilon > 0$. Choose step functions satisfying $s_{jk} \le f \le S_{jk}$ on $I_{jk}$ and
(1) $\iint_{I_{jk}} S_{jk} - \iint_{I_{jk}} s_{jk} < \dfrac{\varepsilon}{mn}.$
Define S and s on I by
$S = s = f$ on the dividing lines,
$S = S_{jk}$ and $s = s_{jk}$ on the interior of $I_{jk}$.
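Then $s \le f \le S$ on I and
(2) $\displaystyle \iint_I S - \iint_I s = \sum_{j,k} \iint_{I_{jk}} S - \sum_{j,k} \iint_{I_{jk}} s = \sum_{j,k} \left( \iint_{I_{jk}} S_{jk} - \iint_{I_{jk}} s_{jk} \right) < \sum_{j,k} \frac{\varepsilon}{mn} = \varepsilon.$
Therefore f is integrable on I, and
(3) $\displaystyle \iint_I s \le \iint_I f \le \iint_I S.$
But $s \le f \le S$ on each $I_{jk}$, so
(4) $\displaystyle \iint_I s = \sum_{j,k} \iint_{I_{jk}} s \le \sum_{j,k} \iint_{I_{jk}} f \le \sum_{j,k} \iint_{I_{jk}} S = \iint_I S.$
Compare the results of (2), (3), and (4):
(5) both $\displaystyle \iint_I f$ and $\displaystyle \sum_{j,k} \iint_{I_{jk}} f$ lie between $\displaystyle \iint_I s$ and $\displaystyle \iint_I S$, which differ by less than $\varepsilon$.
It follows readily that
(6) $\displaystyle \left| \iint_I f - \sum_{j,k} \iint_{I_{jk}} f \right| < \varepsilon.$
Since $\varepsilon$ is arbitrary, (6) implies
$$\iint_I f = \sum_{j,k} \iint_{I_{jk}} f.$$
Now to prove the rest of the theorem, suppose f is integrable on I. It is
enough to prove f is integrable on each $I_{jk}$, for then the first part of the proof
will give the sum formula.
Let $\varepsilon > 0$. Choose s and S so $s \le f \le S$ on I and $\iint_I S - \iint_I s < \varepsilon$.
Then $s \le f \le S$ on $I_{jk}$ and, since $S - s \ge 0$ on every sub-rectangle,
$$\iint_{I_{jk}} S - \iint_{I_{jk}} s \le \iint_I S - \iint_I s < \varepsilon.$$
Hence f is integrable on $I_{jk}$, which completes the proof.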
Products
Theorem 3.9 If f and g are integrable on I, then fg is integrable.
Proof: The identity
$$fg = \tfrac14 \left[ (f + g)^2 - (f - g)^2 \right]$$
shows that it is enough to prove that the square of an integrable function is
integrable. Let f be integrable. Then $f^2 = |f|^2$ and |f| is integrable. Hence we
may assume $f \ge 0$. Also f is bounded, so $0 \le f \le B$.
Let $\varepsilon > 0$. Choose step functions $0 \le s \le f \le S \le B$ and
$\iint S - \iint s < \varepsilon / 2B$. Then $s^2 \le f^2 \le S^2$ and
$$S^2 - s^2 = (S + s)(S - s) \le 2B\,(S - s),$$
hence $\iint S^2 - \iint s^2 \le 2B \iint (S - s) < \varepsilon$. This proves $f^2$ is integrable and
completes the proof.
Remark: A much more general theorem is true, but its proof is very hard:
If H(z, w) is continuous for all z, w and if f and g are integrable on I, then the
composite function
$$h(x, y) = H[f(x, y),\, g(x, y)]$$
is integrable. The special case H(z, w) = zw implies Theorem 3.9.
EXERCISES
1. Suppose a function f on I has the following property: if $\varepsilon > 0$, there exists a step
function s such that
$$|f(x, y) - s(x, y)| < \varepsilon \quad \text{for all } (x, y) \in I.$$
Prove f is integrable.
2. Suppose g(x) is integrable on [a, b]. Set f(x, y) = g(x) on I: $a \le x \le b$, $c \le y \le d$.
Prove f is integrable on I.
3. Define f by f(x, y) = 1 if x and y are both rational, f = 0 otherwise. Prove that f
is not integrable on any rectangle.
4. Suppose f is integrable on $a \le x \le b$, $c \le y \le d$. Define g(x, y) = f(y, x).
Prove that g is integrable on $c \le x \le d$, $a \le y \le b$. Find $\iint g$.
5. (cont.) Suppose f is integrable on $0 \le x, y \le 1$ and f(x, y) + f(y, x) = 0. Prove
$\iint f = 0$.
6*. For each n, divide I into $n^2$ equal rectangles $I_{jk}$. Choose any point $(x_{jk}, y_{jk}) \in I_{jk}$
and set
$$A_n = \frac{|I|}{n^2} \sum_{j,k} f(x_{jk}, y_{jk}).$$
Prove $A_n \to \iint_I f$ if f is integrable. [Hint: First prove it for step functions.]
7. Let f(x, y) = 0 on a rectangle I except at a single point p. Show that f is integrable
and $\iint_I f\, dx\, dy = 0$.
8*. Let f(x, y) be defined in the square I with vertices (0, 0), (0, 1), (1, 1), (1, 0) by
$$f(x, y) = \begin{cases} 1 & \text{if } x = y = \tfrac12, \tfrac13, \tfrac14, \ldots, \\ 0 & \text{otherwise.} \end{cases}$$
Prove that f is integrable and $\iint_I f\, dx\, dy = 0$.
9. Suppose $f(x, y) \ge 0$. Exercises 7 and 8 show that $\iint_I f\, dx\, dy = 0$ does not imply
that f(x, y) = 0 at each point of I. Prove, however, that if f is continuous, then
$\iint_I f\, dx\, dy = 0$ does imply that f(x, y) = 0 at each point of I.
4. ITERATION
Here is the main theorem on iteration of integrals over a rectangle
$I = \{(x, y) \mid a \le x \le b,\ c \le y \le d\}$.
Theorem 4.1 Let f be a bounded function on the domain
$$I = \{(x, y) \mid a \le x \le b,\ c \le y \le d\}.$$
Suppose:
(1) f is integrable on I.
(2) For each $x \in [a, b]$, with possibly a finite set of exceptions, the
integral
$$g(x) = \int_c^d f(x, y)\, dy$$
exists.
Then g is integrable on [a, b] and
$$\int_a^b g\, dx = \int_a^b \left( \int_c^d f(x, y)\, dy \right) dx = \iint_I f\, dx\, dy.$$
Proof: Let $\varepsilon > 0$. Choose step functions satisfying $s \le f \le S$ and
$\iint S - \iint s < \varepsilon$. Define step functions
$$t(x) = \int_c^d s(x, y)\, dy, \qquad T(x) = \int_c^d S(x, y)\, dy.$$
Then for each x, except perhaps the finitely many exceptions, the inequality
$s \le f \le S$ implies
$$t(x) \le g(x) \le T(x).$$
By Theorem 2.3,
$$\int_a^b T(x)\, dx - \int_a^b t(x)\, dx = \iint_I S\, dx\, dy - \iint_I s\, dx\, dy < \varepsilon.$$
This says that g(x) is integrable on [a, b]. It also says that
$$\iint_I s\, dx\, dy \le \int_a^b g(x)\, dx \le \iint_I S\, dx\, dy$$
for each choice of s and S. But the only number that satisfies these inequalities
for all s and S is $\iint_I f$. It follows that
$$\int_a^b g(x)\, dx = \iint_I f\, dx\, dy,$$
completing the proof.
Remark: Hypothesis (1) is not sufficient for this theorem. There are
examples of functions that satisfy (1) but not (2), so g(x) is not even defined,
let alone integrable.
Corollary If f is continuous on I, then
$$\iint_I f\, dx\, dy = \int_a^b \left( \int_c^d f(x, y)\, dy \right) dx = \int_c^d \left( \int_a^b f(x, y)\, dx \right) dy.$$
Proof: By Theorem 3.3, f is integrable on I. Because f is continuous on I,
for each $x_0$ the function $f(x_0, y)$ is continuous on [c, d], hence integrable. Thus
the hypotheses of Theorem 4.1 are satisfied, so the first equality follows. By
symmetry so does the second.
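The corollary says the two orders of iteration give the same number for a continuous integrand. A small numerical sketch can illustrate this; it uses a midpoint rule, which is an approximation chosen here for convenience and is not the construction used in the proof. The helper names and the test function $f(x, y) = x y^2$ on $[0, 2] \times [0, 3]$ are assumptions for illustration.

```python
def midpoint(g, lo, hi, n=200):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (i + 0.5) * h) for i in range(n))


def iterate_dy_dx(f, a, b, c, d, n=200):
    """Inner integral in y, outer integral in x."""
    return midpoint(lambda x: midpoint(lambda y: f(x, y), c, d, n), a, b, n)


def iterate_dx_dy(f, a, b, c, d, n=200):
    """Inner integral in x, outer integral in y."""
    return midpoint(lambda y: midpoint(lambda x: f(x, y), a, b, n), c, d, n)


f = lambda x, y: x * y * y
first = iterate_dy_dx(f, 0.0, 2.0, 0.0, 3.0)
second = iterate_dx_dy(f, 0.0, 2.0, 0.0, 3.0)
exact = 18.0   # (integral of x over [0, 2]) * (integral of y^2 over [0, 3]) = 2 * 9
```

Both approximations land close to the exact value 18 and agree with each other up to rounding, as the corollary predicts for the limits.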
Non-rectangular Domains
This completes the theory of the double integral over a rectangle. Our
next job is to extend this theory to functions with non-rectangular domains.
The first problem is to decide which sets we shall allow as domains. This
is not an easy question; the complete theory is beyond the scope of this course.
We want a theory that includes at least the domains that arise in practice, but
not so many more that the technical details become oppressive.
The kind of domain D we want can be described accurately this way.
(1) It is a closed bounded subset of $R^2$. (2) It is connected, i.e., any two points
of D can be connected by a curve in D. (3) It has many interior points (that is,
each point of D is either an interior point or a limit of a sequence of interior
points). (4) Its boundary consists of a finite number of arcs that have
continuously turning tangents and a finite number of corners (Fig. 4.1a).
Fig. 4.1 Domains of integration: (a) smooth boundary except for a few corners; (b) convex domain; (c) jigsaw puzzle piece
Many of the domains we deal with are convex (Fig. 4.1b). A domain is
convex if it contains the line segment connecting any two of its points.
Probably the most complicated domains arising in practice are shaped like the
pieces of a jigsaw puzzle (Fig. 4.1c). Still, we can define the double integral
of a continuous function on such a domain. We first cut the domain into a
finite number of simple domains, then use iteration and the techniques
developed in Chapter 10 to evaluate the integral on each of these.
The double integral can be defined on domains that have area in a sense we
define now.
Area

Area Let D be a closed subset of the rectangle I. Define the characteristic
function $k_D$ of D by
$$k_D(x, y) = \begin{cases} 1 & \text{if } (x, y) \in D, \\ 0 & \text{if } (x, y) \notin D. \end{cases}$$
The set D is R-measurable if $k_D$ is integrable on I. If so, the area of D is
defined by
$$|D| = \iint_I k_D\, dx\, dy.$$
Remark 1: Note that an R-measurable domain is automatically a
closed set by this definition. Nevertheless, we shall stress this closure in the
following theorems.
Remark 2: It will be left as an exercise to prove that if I and J are two
rectangles, both containing D, then |D| is the same whether computed in I or
in J. This is an easy, but necessary, result in order that |D| be properly defined.
The definition makes good sense. First it gives the right answer for any
rectangle J in I (with sides parallel to the axes). Second, it is additive. This
means that if two sets D and E are R-measurable and they do not intersect,
then their union $D \cup E$ is R-measurable and $|D \cup E| = |D| + |E|$. This statement
is contained in the next theorem.
Notation: At this point it is convenient to introduce the symbol $\emptyset$
for the empty set, the subset of I with no points. This set has the characteristic
function $k_\emptyset = 0$, it is R-measurable, and $|\emptyset| = \iint 0 = 0$. The statement
"D and E have no common points" is abbreviated simply $D \cap E = \emptyset$.
Theorem 4.2 Suppose D and E are closed R-measurable subsets of I.
(1) $D \cap E$ is R-measurable.
(2) $D \cup E$ is R-measurable.
(3) $|D \cup E| = |D| + |E| - |D \cap E|$.

Proof: First we observe that $k_{D \cap E} = k_D k_E$. For $k_{D \cap E}(\mathbf{x}) = 1$ precisely when $\mathbf{x} \in D \cap E$, that is, $\mathbf{x} \in D$ and $\mathbf{x} \in E$, which happens precisely when $k_D(\mathbf{x}) = 1$ and $k_E(\mathbf{x}) = 1$, that is, $k_D(\mathbf{x})\, k_E(\mathbf{x}) = 1$. Now (1) follows because $k_{D \cap E}$ is the product of integrable functions, hence integrable. For (2) we need the relation
$$k_{D \cup E} = k_D + k_E - k_{D \cap E}.$$
It is easily checked, because the right-hand side is 1 precisely when $\mathbf{x}$ is in D and not in E (1 + 0 - 0), or $\mathbf{x}$ is in E and not in D (0 + 1 - 0), or $\mathbf{x}$ is in both (1 + 1 - 1), that is, when $\mathbf{x} \in D \cup E$. Now (3) follows from
$$|D \cup E| = \iint k_{D \cup E} = \iint k_D + \iint k_E - \iint k_{D \cap E} = |D| + |E| - |D \cap E|.$$

The next theorem includes a familiar situation, the area under a curve (the case c = 0).

Theorem 4.3 Let h(x) be a continuous function on [a, b] with $c \le h(x) \le d$. Define D by
$$D = \{(x, y) \mid a \le x \le b,\ c \le y \le h(x)\}.$$
Then D is R-measurable and
$$|D| = \int_a^b [h(x) - c]\, dx.$$

Fig. 4.2 The set (shaded) below the graph of h is D.

Proof: First suppose h is a step function on [a, b]. Then $k_D$ is a step function on I. See Fig. 4.2. Hence $k_D$ is integrable and
$$\iint_I k_D = \int_a^b \left( \int_c^d k_D(x, y)\, dy \right) dx.$$
For fixed x, we have $k_D(x, y) = 1$ if $c \le y \le h(x)$ and $k_D(x, y) = 0$ if $h(x) < y \le d$. Hence
$$\int_c^d k_D(x, y)\, dy = h(x) - c,$$
so
$$|D| = \int_a^b [h(x) - c]\, dx.$$

Now suppose h is continuous on [a, b]. Then D is closed (why?) and we must prove D is R-measurable. Given $\varepsilon > 0$, there are step functions s and S on [a, b] such that $c \le s \le h \le S \le d$ and
$$\int_a^b (S - s)\, dx < \varepsilon.$$
To estimate $k_D$, we construct the sets
$$C = \{(x, y) \mid a \le x \le b,\ c \le y \le s(x)\} \quad \text{and} \quad E = \{(x, y) \mid a \le x \le b,\ c \le y \le S(x)\}.$$
Clearly $C \subset D \subset E$, so $k_C \le k_D \le k_E$. As we have seen, $k_C$ and $k_E$ are step functions, and
$$\iint_I k_E - \iint_I k_C = \int_a^b [S(x) - c]\, dx - \int_a^b [s(x) - c]\, dx = \int_a^b [S(x) - s(x)]\, dx < \varepsilon.$$
This proves that $k_D$ is integrable on I, hence D is R-measurable, and also that
$$\int_a^b [s(x) - c]\, dx \le \iint_I k_D\, dx\, dy \le \int_a^b [S(x) - c]\, dx.$$
Since $\iint_I k_D = |D|$, this relation says that
$$\int_a^b (s - c)\, dx \le |D| \le \int_a^b (S - c)\, dx$$
for all s and S such that $s \le h \le S$. But the only number satisfying these inequalities is $\int_a^b (h - c)\, dx$, hence $|D| = \int_a^b (h - c)\, dx$.

As a corollary, we obtain the area of a region bounded above and below by the graphs of two continuous functions.

Theorem 4.4 Suppose g(x) and h(x) are continuous functions on [a, b] with $c \le g(x) \le h(x) \le d$. Define D by
$$D = \{(x, y) \mid a \le x \le b,\ g(x) \le y \le h(x)\}.$$
Then D is R-measurable and
$$|D| = \int_a^b [h(x) - g(x)]\, dx.$$

Proof: This is a consequence of the last theorem, applied twice, and the relation (Fig. 4.3)
$$D = \{(x, y) \mid c \le y \le h(x)\} \cap \{(x, y) \mid g(x) \le y \le d\}.$$

Remark: The preceding results are adequate for proving that standard geometric figures, like polygons and circles, have area. With a little patience one can now derive the usual area formulas for triangles, parallelograms, etc.

Integrals

Let D be a closed subset of a rectangle I and let f be a bounded real-valued function with domain D. We seek conditions under which
$$\iint_D f\, dx\, dy$$
can be defined. A first modest requirement is that D should be R-measurable. Then at least
$$\iint_D 1\, dx\, dy = \iint_I k_D\, dx\, dy$$
makes sense. We now need a reasonable definition of integrability on non-rectangular closed domains.

Integrable Function Let D be a closed R-measurable subset of I, and let f have domain D. Define F on I by F = f on D, F = 0 elsewhere. We say f is integrable on D if F is integrable on I, and we define
$$\iint_D f\, dx\, dy = \iint_I F\, dx\, dy.$$
It can be shown that this definition is independent of the rectangle I.

Theorem 4.5 Suppose D is a closed R-measurable subset of I, and suppose f is integrable on the rectangle I. Then f is integrable on D.

Proof: The product $F = k_D f$ of two functions integrable on I is itself integrable. But clearly F = f on D and F = 0 outside D, so this integrable function F is exactly the F of the definition. Hence f is integrable on D.
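The definitions of area and of the integral over D both reduce to integrating a function over the enclosing rectangle, which suggests a direct numerical illustration. The sketch below is an assumption-laden example, not the book's method: it takes I to be the unit square, takes D between the hypothetical curves $g(x) = x^2$ and $h(x) = x$, and approximates the integral of F by a midpoint sum.

```python
def integral_over_domain(f, g, h, n=300):
    """Midpoint approximation of the integral of F over I = [0,1] x [0,1],
    where F = f on D = {(x, y): g(x) <= y <= h(x)} and F = 0 elsewhere."""
    step = 1.0 / n
    total = 0.0
    for j in range(n):
        x = (j + 0.5) * step
        lo, hi = g(x), h(x)
        for k in range(n):
            y = (k + 0.5) * step
            if lo <= y <= hi:                  # characteristic function k_D(x, y) = 1
                total += f(x, y) * step * step
    return total


g = lambda x: x * x
h = lambda x: x

# f = 1 gives the area |D|; Theorem 4.4 gives the integral of (x - x^2), i.e. 1/6.
area = integral_over_domain(lambda x, y: 1.0, g, h)

# f = xy; iterating by hand gives the integral of x (x^2 - x^4) / 2, i.e. 1/24.
moment = integral_over_domain(lambda x, y: x * y, g, h)
```

The grid sum only approximates the integral, with an error that shrinks roughly like 1/n because of the cells cut by the boundary of D.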
The General Iteration Theorem

Theorem 4.6 Let g(x) and h(x) be continuous functions on [a, b], with $c \le g(x) \le h(x) \le d$. Define D as the subset of I bounded by the graphs of g and h:
$$D = \{(x, y) \mid a \le x \le b,\ g(x) \le y \le h(x)\}.$$
Let f(x, y) be continuous on D. Then f is integrable on D and
$$\iint_D f(x, y)\, dx\, dy = \int_a^b \left( \int_{g(x)}^{h(x)} f(x, y)\, dy \right) dx.$$

Proof: We begin by constructing a continuous function f* on I such that f* = f on D. The idea is to make f* constant on the vertical segments above and below D. Thus we set
$$f^*(x, y) = \begin{cases} f[x, g(x)] & \text{for } c \le y \le g(x), \\ f(x, y) & \text{for } g(x) \le y \le h(x), \\ f[x, h(x)] & \text{for } h(x) \le y \le d. \end{cases}$$
By definition, f* agrees with f on D. The proof that f* is continuous is routine and is left as an exercise. By Theorem 3.3, f* is integrable on I, so Theorem 4.5 implies f is integrable on D.

Let F = f on D and F = 0 outside D. By definition,
$$\iint_D f\, dx\, dy = \iint_I F\, dx\, dy.$$
For each fixed x,
$$\int_c^d F(x, y)\, dy$$
exists because for fixed x,
$$F(x, y) = \begin{cases} 0, & c \le y < g(x), \\ f(x, y), & g(x) \le y \le h(x), \\ 0, & h(x) < y \le d, \end{cases}$$
a step function on the two end intervals [c, g(x)] and [h(x), d] and a continuous function on the middle interval [g(x), h(x)]. Therefore the hypotheses of Theorem 4.1 are both satisfied by F, so we conclude
$$\iint_D f\, dx\, dy = \iint_I F\, dx\, dy = \int_a^b \left( \int_c^d F\, dy \right) dx.$$
But
$$\int_c^d F\, dy = \int_c^{g(x)} F\, dy + \int_{g(x)}^{h(x)} F\, dy + \int_{h(x)}^d F\, dy = \int_{g(x)}^{h(x)} f(x, y)\, dy.$$
The theorem follows.

Theorem 4.7 Suppose D is a closed R-measurable subset of I, and suppose f is a continuous function with domain D. Then f is integrable on D.

A complete proof of this theorem, with the tools we have developed, is possible but arduous. Instead we shall give a short elegant proof, for which we borrow an important and plausible result from an advanced course on real functions, the Tietze Extension Theorem: There exists a continuous function f* with domain I such that f* = f on D.
With this assumed, the proof of Theorem 4.7 is very short indeed: Since f* is continuous on I, it is integrable. Apply Theorem 4.5. Done!

Remark: The Tietze Theorem stated precisely is this. Let D be a closed subset of $R^2$ (also for $R^3$, or $R^n$ in general). Let f be a continuous real-valued function on D. Then there is a continuous function f* on $R^2$ such that f* = f on D.

Additivity

We close this section with a broad generalization of Theorem 3.8. First we require a more general definition of partition.

Partition Let D be R-measurable. A partition of D consists of R-measurable sets $D_1, \ldots, D_n$ such that
(1) $D = D_1 \cup D_2 \cup \cdots \cup D_n$,
(2) $|D_i \cap D_j| = 0$ if $i \ne j$.

Theorem 4.8 Suppose D is partitioned into $D_1, \ldots, D_n$, and suppose f is a bounded function on D. Then f is integrable on D if and only if f is integrable on each $D_j$. If so, then
$$\iint_D f\, dx\, dy = \sum_{j=1}^n \iint_{D_j} f\, dx\, dy.$$

Proof: Let the characteristic function of $D_j$ be $k_j$, an integrable function by hypothesis. Suppose f is integrable on D. Then $f k_j$ is integrable on D, which means f is integrable on $D_j$ since $D_j \subset D$. Now we compute $\iint_D f$. Clearly
$$\sum_{j=1}^n k_j = k_D + e,$$
where the value of the error function $e = e(\mathbf{x})$ is one less than the number of sets $D_j$ containing $\mathbf{x}$. Since $e(\mathbf{x}) > 0$ only if $\mathbf{x}$ belongs to two or more of the $D_j$, and $|D_i \cap D_j| = 0$, we have
$$\iint_D e \le (n - 1) \sum_{i \ne j} |D_i \cap D_j| = 0.$$
Therefore
$$\left| \iint_D f e \right| \le (\max |f|) \iint_D e = 0, \quad \text{hence} \quad \iint_D f e = 0.$$
But on D,
$$f = f k_D = \sum_{j=1}^n f k_j - f e,$$
so
$$\iint_D f = \sum_{j=1}^n \iint_D f k_j = \sum_{j=1}^n \iint_{D_j} f.$$
Now suppose f is integrable on each $D_j$. Then the same relation, $f = \sum f k_j - f e$, implies f is integrable on D. This completes the proof.

EXERCISES

1. In the proof of Theorem 4.1, we omitted to prove that g(x) is bounded. Supply this missing step.
2. Prove the assertion about two rectangles in Remark 2 on p. 475.
3. Prove that the area of a line segment is 0.
4. Prove, on the basis of Theorem 4.4, that a triangle with vertices 0, $(x_1, y_1)$, and $(x_2, y_2)$, where $x_1 < 0 < x_2$, has area $\tfrac12 |x_1 y_2 - x_2 y_1|$.
5. (cont.) Prove that a triangle with vertices $(x_1, y_1)$, $(x_2, y_2)$, $(x_3, y_3)$ has area
$$\frac12 \left| \det \begin{pmatrix} x_1 & y_1 & 1 \\ x_2 & y_2 & 1 \\ x_3 & y_3 & 1 \end{pmatrix} \right|.$$
6. (cont.) Prove A = bh for a parallelogram.
7*. Prove that the function f* in the proof of Theorem 4.6 is continuous.

5. CHANGE OF VARIABLES

Let us begin with a review of the one-variable situation.

Theorem 5.1 Suppose $\phi(u)$ is continuously differentiable on [a, b] and suppose $\phi'(u) > 0$. Let f(x) be integrable on $[\phi(a), \phi(b)]$. Then $f[\phi(u)]\, \phi'(u)$ is integrable on [a, b] and
$$\int_{\phi(a)}^{\phi(b)} f(x)\, dx = \int_a^b f[\phi(u)]\, \phi'(u)\, du.$$

We shall not give a detailed proof of this theorem. We simply note that it is easily verified when f is a step function. The usual approximation, with some technical details, then proves it for f integrable.

Let us rather interpret Theorem 5.1 in such a way that its analogue in more dimensions, Theorem 5.2, will seem natural.

We write $x = \phi(u)$, where u runs over the interval D = [a, b] and x runs over the interval $E = [\phi(a), \phi(b)]$. Since $\phi' > 0$, it follows that x increases as u increases. Thus $\phi$ is a one-one mapping on the set D onto the set E, and we may write $E = \phi(D)$. With this notation, the formula of Theorem 5.1 can be written
$$\int_{\phi(D)} f(x)\, dx = \int_D f[\phi(u)]\, \phi'(u)\, du,$$
or with x = x(u) instead of $x = \phi(u)$,
$$\int_E f(x)\, dx = \int_D f[x(u)]\, \frac{dx}{du}\, du.$$
A similar formula holds in the plane and in space. There $\phi$ is a one-one mapping of some domain D of, say, $R^2$ onto another domain E. We shall need some information about such mappings, in particular, what replaces the derivative dx/du.

Jacobians

Let us note some facts about mappings. Let S be an open set in the u, v-plane and $\phi$ a continuously differentiable mapping of S into the x, y-plane. Write
$$\phi: \quad x = x(u, v), \qquad y = y(u, v).$$
Thus the four partials $\partial x / \partial u, \ldots, \partial y / \partial v$ exist and are continuous on S. We define the Jacobian of $\phi$ to be the determinant
$$\frac{\partial(x, y)}{\partial(u, v)} = \begin{vmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v} \\[2mm] \dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v} \end{vmatrix}.$$
It is a continuous function on S.

Suppose $\phi$ maps S into an open set T and $\psi$ in turn is a continuously differentiable mapping of T into the z, w-plane. The composite $\psi \circ \phi$ is a mapping of S into the z, w-plane.
Thus
$$z = z(x, y) = z[x(u, v),\, y(u, v)], \qquad w = w(x, y) = w[x(u, v),\, y(u, v)].$$
By the Chain Rule, we have the matrix relation
$$\begin{pmatrix} \dfrac{\partial z}{\partial u} & \dfrac{\partial z}{\partial v} \\[2mm] \dfrac{\partial w}{\partial u} & \dfrac{\partial w}{\partial v} \end{pmatrix} = \begin{pmatrix} \dfrac{\partial z}{\partial x} & \dfrac{\partial z}{\partial y} \\[2mm] \dfrac{\partial w}{\partial x} & \dfrac{\partial w}{\partial y} \end{pmatrix} \begin{pmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v} \\[2mm] \dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v} \end{pmatrix}.$$
Recall that the determinant of a product of matrices is the product of their determinants. Therefore
$$\frac{\partial(z, w)}{\partial(u, v)} = \frac{\partial(z, w)}{\partial(x, y)} \cdot \frac{\partial(x, y)}{\partial(u, v)}.$$

Suppose in particular that $\phi$ has an inverse and $\psi = \phi^{-1}$. Then $\phi$ is one-one on S onto T; also $\psi$ is one-one on T onto S, and $\psi \circ \phi$ = identity. Then
$$u = u(x, y), \qquad v = v(x, y),$$
so
$$u = u[x(u, v),\, y(u, v)], \qquad v = v[x(u, v),\, y(u, v)],$$
$$\frac{\partial(u, v)}{\partial(x, y)} \cdot \frac{\partial(x, y)}{\partial(u, v)} = \begin{vmatrix} 1 & 0 \\ 0 & 1 \end{vmatrix} = 1.$$
A continuously differentiable mapping $\phi$ that has a continuously differentiable inverse is called a regular transformation. We have proved: The Jacobian of a regular transformation is never 0.

Suppose the domain S of $\phi$, in addition to being open, is connected. That is, any two of its points can be connected by a curve in S. Then the Jacobian of a regular transformation has constant sign in S, either always positive or always negative. For otherwise it would have to be zero at some point along a curve joining points where it had opposite signs. We shall concentrate on the case where the sign is always positive. We shall call a regular transformation with everywhere positive Jacobian a proper transformation.

Examples

Example 1. Translation:
$$x = u + a, \quad y = v + b, \qquad \frac{\partial(x, y)}{\partial(u, v)} = \begin{vmatrix} 1 & 0 \\ 0 & 1 \end{vmatrix} = 1.$$
The transformation is proper on the whole plane.

Example 2. Linear transformation:
$$x = a_{11} u + a_{12} v, \quad y = a_{21} u + a_{22} v, \qquad \frac{\partial(x, y)}{\partial(u, v)} = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix}.$$
If we assume the determinant is positive, then $\phi$ is a proper transformation on the whole plane. If D is the unit square in the u, v-plane, then $E = \phi(D)$ is a parallelogram (Fig. 5.1). More generally, any rectangle in the u, v-plane is mapped to a parallelogram.

Example 3.
Polar coordinates:
$$x = r \cos\theta, \quad y = r \sin\theta, \qquad \frac{\partial(x, y)}{\partial(r, \theta)} = \begin{vmatrix} \cos\theta & -r \sin\theta \\ \sin\theta & r \cos\theta \end{vmatrix} = r.$$
This is a proper transformation on any domain D of the r, $\theta$-plane that avoids the $\theta$-axis. For instance, the rectangle D in Fig. 5.2 is mapped to the circular region E.

Fig. 5.2 $x = r \cos\theta$, $y = r \sin\theta$

The last two examples illustrate the most common reason for changing variables, to change the domain of integration to a rectangle. Here is a further such example, but in 3-space.

Example 4. Spherical coordinates:
$$\phi: \quad x = \rho \sin\phi \cos\theta, \quad y = \rho \sin\phi \sin\theta, \quad z = \rho \cos\phi,$$
$$\frac{\partial(x, y, z)}{\partial(\rho, \phi, \theta)} = \begin{vmatrix} \sin\phi \cos\theta & \rho \cos\phi \cos\theta & -\rho \sin\phi \sin\theta \\ \sin\phi \sin\theta & \rho \cos\phi \sin\theta & \rho \sin\phi \cos\theta \\ \cos\phi & -\rho \sin\phi & 0 \end{vmatrix} = \rho^2 \sin\phi \begin{vmatrix} \sin\phi \cos\theta & \cos\phi \cos\theta & -\sin\theta \\ \sin\phi \sin\theta & \cos\phi \sin\theta & \cos\theta \\ \cos\phi & -\sin\phi & 0 \end{vmatrix} = \rho^2 \sin\phi.$$
(The determinant can be expanded easily by minors of the third row.) A rectangular solid
$$D: \quad 0 < \rho_0 \le \rho \le \rho_1, \quad 0 < \phi_0 \le \phi \le \phi_1 < \pi, \quad 0 < \theta_0 \le \theta \le \theta_1 < 2\pi$$
in $\rho$, $\phi$, $\theta$-space maps to a curved solid in x, y, z-space.

The Main Theorem

Theorem 5.2 Let S be an open subset of the u, v-plane and $\phi$ a proper transformation on S into the x, y-plane. Let D be an R-measurable subset of S and $E = \phi(D)$ the image of D under $\phi$. Then:
(1) E is R-measurable.
(2) If f is integrable on E, then the product $[(f \circ \phi)(u, v)]\, \dfrac{\partial(x, y)}{\partial(u, v)}$ is integrable on D and
$$\iint_E f(x, y)\, dx\, dy = \iint_D f[x(u, v),\, y(u, v)]\, \frac{\partial(x, y)}{\partial(u, v)}\, du\, dv.$$

A proof is hard and beyond the scope of this course. We shall note one case where the theorem is quite plausible, that of the linear transformation on a rectangle (Example 2 above). First suppose f = 1 on E, and look carefully again at Fig. 5.1. The formula in Theorem 5.2 boils down to
$$|E| = |D|\,(a_{11} a_{22} - a_{12} a_{21}) = a_{11} a_{22} - a_{12} a_{21}.$$
This is certainly correct, since the area of the parallelogram E, by vector algebra, is
$$|E| = |(a_{11}, a_{21}) \times (a_{12}, a_{22})| = a_{11} a_{22} - a_{12} a_{21}.$$
In fact any rectangle $D_1$ in the u, v-plane is mapped to a parallelogram $E_1$ with $|E_1| = |D_1|\,(a_{11} a_{22} - a_{12} a_{21})$.
Therefore if $f(x, y)$ is a step function on E relative to a division of E by lines parallel to the sides of E, then $f \circ \phi$ is a step function on D and (2) in Theorem 5.2 is again true. This suggests an approximation procedure for proving (2) for any integrable function on E, but we must omit the details.
Once Theorem 5.2 is known for proper linear transformations on rectangles, it can be proved in general by another, deeper, approximation technique. This is to approximate the general proper transformation by a piecewise linear transformation; the Jacobian plays the role of an area distortion factor. The complete proof is quite formidable.
Polar Coordinates
The polar coordinate mapping
\[ x = r\cos\theta, \quad y = r\sin\theta, \qquad \frac{\partial(x, y)}{\partial(r, \theta)} = r \]
is a proper transformation in the open set
\[ S = \{(r, \theta) \mid 0 < r,\ 0 < \theta < 2\pi\}, \]
and (Fig. 5.3) it maps S onto
\[ T = \{(x, y) \mid y \ne 0, \text{ or } y = 0 \text{ and } x < 0\}. \]
Fig. 5.3 polar coordinate transformation
There is a problem with $r = 0$ because many typical integration problems include $r = 0$. See Fig. 5.4.
Fig. 5.4 including the origin
The problem is solved by deleting a small circular sector and taking limits:
\[ \iint_E f(x, y)\,dx\,dy = \lim_{\varepsilon \to 0} \iint_{E,\ r > \varepsilon} f\,dx\,dy = \lim_{\varepsilon \to 0} \iint_{D,\ r > \varepsilon} f(r\cos\theta, r\sin\theta)\,r\,dr\,d\theta = \iint_D f\,r\,dr\,d\theta. \]
The process works because the excluded region (circular sector in E, rectangular strip in D) has area that goes to 0 as $\varepsilon \to 0$. Similarly we can take care of a domain E that goes completely around the origin, so that it hits both horizontal boundaries of S in Fig. 5.3. In summary, the formula
\[ \iint_E f\,dx\,dy = \iint_D f\,\frac{\partial(x, y)}{\partial(r, \theta)}\,dr\,d\theta = \iint_D f(r\cos\theta, r\sin\theta)\,r\,dr\,d\theta \]
holds without restriction on the R-measurable domain D.
Analogous considerations apply to the transformation from spherical to rectangular coordinates:
\[ x = \rho\sin\phi\cos\theta, \quad y = \rho\sin\phi\sin\theta, \quad z = \rho\cos\phi, \qquad \frac{\partial(x, y, z)}{\partial(\rho, \phi, \theta)} = \rho^2\sin\phi. \]
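The Jacobians quoted for polar and spherical coordinates can be spot-checked numerically. The following is a minimal sketch, not part of the text's development: the helper names `det` and `jacobian`, the step size `eps`, and the sample points are all choices made here for illustration. It estimates each partial derivative by a central difference and compares the resulting determinant with $r$ and with $\rho^2\sin\phi$.

```python
import math

def det(m):
    """Determinant of a 2x2 or 3x3 matrix given as a list of rows."""
    if len(m) == 2:
        return m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def jacobian(mapping, point, eps=1e-6):
    """Central-difference estimate of the Jacobian determinant of mapping at point."""
    rows = []
    for j in range(len(point)):
        plus = list(point)
        minus = list(point)
        plus[j] += eps
        minus[j] -= eps
        hi = mapping(*plus)
        lo = mapping(*minus)
        # This row holds the partials with respect to variable j, so the matrix
        # is the transpose of the usual one; the determinant is unchanged.
        rows.append([(a - b) / (2 * eps) for a, b in zip(hi, lo)])
    return det(rows)

def polar(r, theta):
    return (r * math.cos(theta), r * math.sin(theta))

def spherical(rho, phi, theta):
    return (rho * math.sin(phi) * math.cos(theta),
            rho * math.sin(phi) * math.sin(theta),
            rho * math.cos(phi))

# Sample points chosen arbitrarily for the check.
print(jacobian(polar, (2.0, 0.7)), "vs r =", 2.0)
print(jacobian(spherical, (2.0, 0.9, 0.4)), "vs rho^2 sin(phi) =", 4.0 * math.sin(0.9))
```

Both estimates are positive at these points, as a proper transformation requires.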
EXERCISES
Compute the Jacobian and prove that the mapping is a proper transformation:
1. $x = 2u + v$, $y = u + 2v$, all $(u, v)$
2. $x = u + v$, $y = uv$, $u > v$
3. $x = u + v$, $y = u^2 + v^2$, $v > u$
4. $x = u$, $y = uv$, $u > 0$
5. $x = u$, $y = uv$, $z = uvw$, $u > 0$, $v > 0$
6. $x = vw$, $y = wu$, $z = uv$, $u > 0$, $v > 0$, $w > 0$.
7. The transformation of inversion in the plane is defined by $\mathbf{x} \to \mathbf{x}/|\mathbf{x}|^2$ for $\mathbf{x} \ne 0$. Find its Jacobian.
8. (cont.) Generalize to $\mathbf{R}^3$.
9. What domain in $\rho, \phi, \theta$-space corresponds to the solid sphere $|\mathbf{x}| \le \varepsilon$ in $x, y, z$-space? Show that its volume goes to 0 as $\varepsilon \to 0$.
6. APPLICATIONS OF INTEGRATION
Leibniz Rule
In this section we take up a number of important applications of multiple integration. The first concerns differentiation under the integral sign. Consider a function defined by a definite integral,
\[ F(t) = \int_a^b f(x, t)\,dx. \]
Think of $t$ as a parameter. When the variable $x$ is integrated out, there remains a function of $t$. Problem: find the derivative $F'(t)$. The answer is called the Leibniz Rule, or the rule for differentiating under the integral sign.
Leibniz Rule Suppose $f(x, t)$ and the partial derivative $f_t(x, t)$ are continuous on a rectangle $a \le x \le b$, $c \le t \le d$. Then
\[ \frac{d}{dt}\int_a^b f(x, t)\,dx = \int_a^b f_t(x, t)\,dx \]
for $c \le t \le d$.
Proof: For each $t \in [c, d]$ let $D_t$ denote the rectangle
\[ D_t = \{(x, s) \mid a \le x \le b,\ c \le s \le t\} \]
in the $x, s$-plane. The idea is to evaluate
\[ G(t) = \iint_{D_t} f_s(x, s)\,dx\,ds \]
in two different ways, then to compute $G'(t)$. On the one hand,
\[ (1) \qquad G(t) = \int_a^b \Bigl( \int_c^t f_s(x, s)\,ds \Bigr) dx = \int_a^b [f(x, t) - f(x, c)]\,dx, \]
where the inner integral is evaluated by the Fundamental Theorem of Calculus. On the other hand,
\[ (2) \qquad G(t) = \int_c^t \Bigl( \int_a^b f_s(x, s)\,dx \Bigr) ds. \]
From (1),
\[ G'(t) = \frac{d}{dt}\int_a^b f(x, t)\,dx - \frac{d}{dt}\int_a^b f(x, c)\,dx = \frac{d}{dt}\int_a^b f(x, t)\,dx. \]
From (2),
\[ G'(t) = \frac{d}{dt}\int_c^t \Bigl( \int_a^b f_s(x, s)\,dx \Bigr) ds = \int_a^b f_t(x, t)\,dx. \]
The Leibniz Rule follows upon equating these expressions for $G'(t)$.
EXAMPLE 6.1
Find $\dfrac{d}{dt}\displaystyle\int_0^\pi \frac{\sin tx}{x}\,dx$ at $t = \tfrac{1}{2}$.
Solution:
\[ \frac{d}{dt}\int_0^\pi \frac{\sin tx}{x}\,dx = \int_0^\pi \frac{\partial}{\partial t}\Bigl(\frac{\sin tx}{x}\Bigr)dx = \int_0^\pi \cos tx\,dx = \frac{\sin \pi t}{t}. \]
When $t = \tfrac{1}{2}$, the value is $(\sin \tfrac{1}{2}\pi)/\tfrac{1}{2} = 2$.
Answer: 2.
Remark: It is known that
\[ F(t) = \int_0^\pi \frac{\sin tx}{x}\,dx \]
cannot be expressed in terms of (a finite number of) the usual functions of calculus. In other words, you won't find it in a table of integrals, except as an infinite series. Nevertheless, $F(t)$ is a perfectly good differentiable function. But to compute its derivative, you need the Leibniz Rule.
Volume of an Ellipsoid
EXAMPLE 6.2
Find the volume enclosed by the ellipsoid
\[ \frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1, \qquad a, b, c > 0. \]
Solution: Define sets
\[ D = \{(u, v, w) \mid u^2 + v^2 + w^2 \le 1\}, \qquad E = \Bigl\{(x, y, z) \Bigm| \frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} \le 1\Bigr\} \]
and the mapping $\phi$: $x = au$, $y = bv$, $z = cw$. Then $\phi$ takes D onto E, and $\phi$ is a proper transformation (non-singular linear actually), so
\[ |E| = \iiint_E dx\,dy\,dz = \iiint_D \frac{\partial(x, y, z)}{\partial(u, v, w)}\,du\,dv\,dw = \iiint_D abc\,du\,dv\,dw = abc\,|D|. \]
But D is the solid unit sphere, so $|D| = \tfrac{4}{3}\pi$.
Answer: $\tfrac{4}{3}\pi abc$.
Surface Area
Our aim is to give a reasonable definition of surface area in $\mathbf{R}^3$. Up to now, a surface has always meant the graph of a function $z = f(x, y)$. A surface represented this way is called a non-parametric surface. There is, however, a more flexible concept called a parametric surface. Such a surface is the image under a nice mapping of a domain in the plane, just as a parametric curve is the image of an interval.
We start with a reasonable domain D in the $u, v$-plane (parameter plane) and a differentiable function $\mathbf{x} = \mathbf{x}(u, v)$ on D with values in $\mathbf{R}^3$. We assume that $\mathbf{x}$ maps D one-to-one on a set $S \subset \mathbf{R}^3$. Then S is a parametric surface.
Each point of S is determined by the unique point of D from which it comes. Thus the point $\mathbf{x}(u, v)$ can be assigned the coordinates $(u, v)$.
In general, the grid of coordinate lines $u$ = const., $v$ = const. is mapped onto a grid of coordinate curves on S. See Fig. 6.1.
The horizontal coordinate line through a point $(a, b)$ of D can be written $u = a + t$, $v = b$. The corresponding curve on the surface is $\mathbf{x}(t) = \mathbf{x}(a + t, b)$. The velocity vector of this curve is
\[ \frac{d\mathbf{x}}{dt} = \mathbf{x}_u. \]
Thus $\mathbf{x}_u$ is tangent to the coordinate curve, hence tangent to the surface. A similar statement holds for $\mathbf{x}_v$.
Fig. 6.1 parametric surface
We now make the assumption that $\mathbf{x}_u$ and $\mathbf{x}_v$ are non-collinear at each point of S. This guarantees that the two coordinate curves through each point of S are not tangent, but cross at a non-zero angle. In addition, it guarantees the existence of a well-defined tangent plane at each point of the surface. For the tangent plane is the plane determined by the non-collinear tangent vectors $\mathbf{x}_u$ and $\mathbf{x}_v$. Further evidence: the image of a curve $u = u(t)$, $v = v(t)$ in D is the curve $\mathbf{x}(t) = \mathbf{x}[u(t), v(t)]$. Its velocity vector (tangent to the surface) is
\[ \frac{d\mathbf{x}}{dt} = \mathbf{x}_u\frac{du}{dt} + \mathbf{x}_v\frac{dv}{dt}, \]
a vector which lies in the plane of $\mathbf{x}_u$ and $\mathbf{x}_v$.
Now the vector $\mathbf{x}_u \times \mathbf{x}_v$ plays an important role. It is not zero because $\mathbf{x}_u$ and $\mathbf{x}_v$ are non-collinear; it is perpendicular to the surface in the sense that it is perpendicular to $\mathbf{x}_u$ and to $\mathbf{x}_v$, hence to the tangent plane. Therefore
\[ \mathbf{n} = \frac{\mathbf{x}_u \times \mathbf{x}_v}{|\mathbf{x}_u \times \mathbf{x}_v|} \]
is one of the two unit normals to the surface. The way it points defines for us the top of the parametric surface.
Let us return to our project of finding a reasonable definition for surface area. A small rectangle in D with sides $du$ and $dv$ maps to a small region on the surface. According to the formula $d\mathbf{x} = \mathbf{x}_u\,du + \mathbf{x}_v\,dv$, this region is closely approximated by the parallelogram (Fig. 6.2) in the tangent plane with sides $\mathbf{x}_u\,du$ and $\mathbf{x}_v\,dv$. Its area is
\[ (*) \qquad dA = |(\mathbf{x}_u\,du) \times (\mathbf{x}_v\,dv)| = |\mathbf{x}_u \times \mathbf{x}_v|\,du\,dv. \]
Fig.
6.2 approximation to area by a small parallelogram
Recall that
\[ \mathbf{x}_u \times \mathbf{x}_v = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ x_u & y_u & z_u \\ x_v & y_v & z_v \end{vmatrix} = \begin{vmatrix} y_u & z_u \\ y_v & z_v \end{vmatrix}\mathbf{i} + \begin{vmatrix} z_u & x_u \\ z_v & x_v \end{vmatrix}\mathbf{j} + \begin{vmatrix} x_u & y_u \\ x_v & y_v \end{vmatrix}\mathbf{k} = \frac{\partial(y, z)}{\partial(u, v)}\,\mathbf{i} + \frac{\partial(z, x)}{\partial(u, v)}\,\mathbf{j} + \frac{\partial(x, y)}{\partial(u, v)}\,\mathbf{k}. \]
Consequently
\[ |\mathbf{x}_u \times \mathbf{x}_v|^2 = \Bigl(\frac{\partial(y, z)}{\partial(u, v)}\Bigr)^2 + \Bigl(\frac{\partial(z, x)}{\partial(u, v)}\Bigr)^2 + \Bigl(\frac{\partial(x, y)}{\partial(u, v)}\Bigr)^2. \]
We substitute this into (*), then add up the small pieces of area by an integral. Thus we arrive at the following definition:
Surface Area Let $\mathbf{x} = \mathbf{x}(u, v)$ define a parametric surface with domain D. Its surface area is
\[ A = \iint_D |\mathbf{x}_u \times \mathbf{x}_v|\,du\,dv = \iint_D \sqrt{\Bigl(\frac{\partial(y, z)}{\partial(u, v)}\Bigr)^2 + \Bigl(\frac{\partial(z, x)}{\partial(u, v)}\Bigr)^2 + \Bigl(\frac{\partial(x, y)}{\partial(u, v)}\Bigr)^2}\;du\,dv. \]
There is one possible flaw in the definition. When we look at a surface, we see a definite area, quite independent of how we have described the surface. If the same geometric surface has two different parametrizations, how do we know that we get the same area from the two corresponding integrals?
This means that we have two plane domains D and E, each defining the surface S by differentiable functions $\mathbf{x}(u, v)$ and $\mathbf{y}(s, t)$ respectively (Fig. 6.3).
Fig. 6.3 a surface described by two different parametrizations
Since both D and E are in one-one correspondence with S, they are in one-one correspondence with each other. Thus there is a transformation $\phi$ from E onto D,
\[ u = u(s, t), \qquad v = v(s, t), \]
such that
\[ \mathbf{y}(s, t) = \mathbf{x}[u(s, t),\, v(s, t)]. \]
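By the Chain Rule,
\[ \mathbf{y}_s = \mathbf{x}_u u_s + \mathbf{x}_v v_s, \qquad \mathbf{y}_t = \mathbf{x}_u u_t + \mathbf{x}_v v_t, \]
hence
\[ \mathbf{y}_s \times \mathbf{y}_t = (u_s v_t - v_s u_t)\,\mathbf{x}_u \times \mathbf{x}_v = \frac{\partial(u, v)}{\partial(s, t)}\,\mathbf{x}_u \times \mathbf{x}_v. \]
We also assume that there is a definite top side of the surface, the same whether
computed by means of $\mathbf{x}(u, v)$ or $\mathbf{y}(s, t)$. This means that $\mathbf{x}_u \times \mathbf{x}_v$ and $\mathbf{y}_s \times \mathbf{y}_t$
point in the same direction, that is,
\[ \frac{\partial(u, v)}{\partial(s, t)} > 0. \]
Hence $\phi$ is a proper transformation, and
\[ |\mathbf{y}_s \times \mathbf{y}_t| = |\mathbf{x}_u \times \mathbf{x}_v|\,\frac{\partial(u, v)}{\partial(s, t)}. \]
Now we can compare the D-area and the E-area of the surface, using the
change of variable formula:
\[ A_D = \iint_D |\mathbf{x}_u \times \mathbf{x}_v|\,du\,dv = \iint_E |\mathbf{x}_u \times \mathbf{x}_v|\,\frac{\partial(u, v)}{\partial(s, t)}\,ds\,dt = \iint_E |\mathbf{y}_s \times \mathbf{y}_t|\,ds\,dt = A_E. \]
The areas are the same, computed either way. The definition is OK!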
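EXAMPLE 6.3
Find the area of the spiral ramp $\mathbf{x} = (u\cos v,\ u\sin v,\ bv)$ corresponding to the rectangle D: $0 \le u \le a$, $0 \le v \le c$.
Solution: Although not necessary, it is nice to sketch the surface
(Fig. 6.4). Since
\[ \mathbf{x}_u = (\cos v,\ \sin v,\ 0) \quad\text{and}\quad \mathbf{x}_v = (-u\sin v,\ u\cos v,\ b), \]
the element of area is
\[ dA = \sqrt{ \begin{vmatrix} \sin v & 0 \\ u\cos v & b \end{vmatrix}^2 + \begin{vmatrix} 0 & \cos v \\ b & -u\sin v \end{vmatrix}^2 + \begin{vmatrix} \cos v & \sin v \\ -u\sin v & u\cos v \end{vmatrix}^2 }\;du\,dv = \sqrt{b^2\sin^2 v + b^2\cos^2 v + u^2}\;du\,dv = \sqrt{b^2 + u^2}\;du\,dv. \]
As $(u, v)$ ranges over the rectangle, the point $\mathbf{x}(u, v)$ runs over the spiral
ramp. Hence
\[ A = \iint_D \sqrt{b^2 + u^2}\;du\,dv = \Bigl(\int_0^a \sqrt{b^2 + u^2}\;du\Bigr)\Bigl(\int_0^c dv\Bigr). \]
Answer: $\dfrac{c}{2}\Bigl[a\sqrt{a^2 + b^2} + b^2 \ln\dfrac{a + \sqrt{a^2 + b^2}}{b}\Bigr]$.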
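The definition $A = \iint_D |\mathbf{x}_u \times \mathbf{x}_v|\,du\,dv$ can be tried out numerically on the spiral ramp. The sketch below is an illustration only: the function `surface_area`, the grid size `n`, the step `eps`, and the sample values $a = 2$, $b = 1$, $c = 3$ are assumptions made here. It approximates the partial derivatives by central differences, sums $|\mathbf{x}_u \times \mathbf{x}_v|$ by the midpoint rule, and compares with the closed form obtained from $\int\sqrt{b^2 + u^2}\,du = \tfrac{1}{2}u\sqrt{u^2 + b^2} + \tfrac{1}{2}b^2\ln(u + \sqrt{u^2 + b^2})$.

```python
import math

def cross(p, q):
    """Cross product of two 3-vectors."""
    return (p[1] * q[2] - p[2] * q[1],
            p[2] * q[0] - p[0] * q[2],
            p[0] * q[1] - p[1] * q[0])

def surface_area(x, u0, u1, v0, v1, n=200, eps=1e-5):
    """Midpoint-rule estimate of the integral of |x_u cross x_v| over a rectangle."""
    du = (u1 - u0) / n
    dv = (v1 - v0) / n
    total = 0.0
    for i in range(n):
        u = u0 + (i + 0.5) * du
        for j in range(n):
            v = v0 + (j + 0.5) * dv
            # Central differences for the two tangent vectors.
            xu = [(p - q) / (2 * eps) for p, q in zip(x(u + eps, v), x(u - eps, v))]
            xv = [(p - q) / (2 * eps) for p, q in zip(x(u, v + eps), x(u, v - eps))]
            normal = cross(xu, xv)
            total += math.sqrt(sum(t * t for t in normal)) * du * dv
    return total

# Sample dimensions, chosen only for this check.
a, b, c = 2.0, 1.0, 3.0

def ramp(u, v):
    return (u * math.cos(v), u * math.sin(v), b * v)

numeric = surface_area(ramp, 0.0, a, 0.0, c)
root = math.sqrt(a * a + b * b)
closed = 0.5 * c * (a * root + b * b * math.log((a + root) / b))
print(numeric, closed)
```

The two printed values should agree to several decimal places; the small gap comes from the midpoint rule, not from the formula.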
A non-parametric surface, the graph of $z = f(x, y)$, is a special case of a
parametric surface, provided we think of $x$ and $y$ as parameters. The variable
point on the surface is $\mathbf{x} = (x, y, f(x, y))$. Then
\[ \frac{\partial \mathbf{x}}{\partial x} = (1, 0, f_x), \qquad \frac{\partial \mathbf{x}}{\partial y} = (0, 1, f_y). \]
Consequently
\[ \frac{\partial \mathbf{x}}{\partial x} \times \frac{\partial \mathbf{x}}{\partial y} = (1, 0, f_x) \times (0, 1, f_y) = (-f_x, -f_y, 1), \]
and the resulting formula for the element of area is
\[ dA = \sqrt{1 + f_x^2 + f_y^2}\;dx\,dy, \]
a most useful expression.
The formula has a geometric interpretation. The unit normal to the surface
is
\[ \mathbf{N} = \frac{1}{\sqrt{1 + f_x^2 + f_y^2}}\,(-f_x, -f_y, 1). \]
Its third component (direction cosine) is
\[ \cos\gamma = \frac{1}{\sqrt{1 + f_x^2 + f_y^2}}, \]
where $\gamma$ is the angle between the normal and the $z$-axis. Thus
\[ (\cos\gamma)\,dA = dx\,dy, \]
which means that the small piece of surface of area dA projects onto a small
portion of the x, y-plane of area dx dy.
EXAMPLE 6.4
Find the area of the portion of the hyperbolic paraboloid (saddle
surface) $z = x^2 - y^2$ defined on the domain $x^2 + y^2 \le a^2$.
Solution: First sketch the surface, then the portion corresponding to the
range $x^2 + y^2 \le a^2$. See Fig. 6.5. By the formula,
\[ dA = \sqrt{1 + f_x^2 + f_y^2}\;dx\,dy = \sqrt{1 + (2x)^2 + (-2y)^2}\;dx\,dy = \sqrt{1 + 4x^2 + 4y^2}\;dx\,dy. \]
Use polar coordinates:
\[ A = \iint \sqrt{1 + 4r^2}\;r\,dr\,d\theta = \int_0^{2\pi} d\theta \int_0^a r\sqrt{1 + 4r^2}\;dr = 2\pi \cdot \frac{1}{12}\bigl[(1 + 4a^2)^{3/2} - 1\bigr]. \]
Answer: $\dfrac{\pi}{6}\bigl[(1 + 4a^2)^{3/2} - 1\bigr]$.
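As a sanity check on the formula $dA = \sqrt{1 + f_x^2 + f_y^2}\,dx\,dy$ combined with the polar change of variables, here is a short sketch for the saddle $z = x^2 - y^2$ over the disk of radius $a$. The helper name `graph_area_polar`, the grid size, and the choice $a = 1$ are assumptions for illustration; the comparison value is $\tfrac{\pi}{6}[(1 + 4a^2)^{3/2} - 1]$.

```python
import math

def graph_area_polar(f, a, n=200, eps=1e-5):
    """Area of the graph z = f(x, y) over the disk r <= a, midpoint rule in (r, theta)."""
    dr = a / n
    dtheta = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        for j in range(n):
            theta = (j + 0.5) * dtheta
            x, y = r * math.cos(theta), r * math.sin(theta)
            # Central-difference estimates of the partial derivatives f_x and f_y.
            fx = (f(x + eps, y) - f(x - eps, y)) / (2 * eps)
            fy = (f(x, y + eps) - f(x, y - eps)) / (2 * eps)
            # Element of area times the polar Jacobian r.
            total += math.sqrt(1 + fx * fx + fy * fy) * r * dr * dtheta
    return total

a = 1.0  # sample radius
numeric = graph_area_polar(lambda x, y: x * x - y * y, a)
closed = math.pi / 6 * ((1 + 4 * a * a) ** 1.5 - 1)
print(numeric, closed)
```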
Green's Theorem
There is a useful connection between certain line integrals and double
integrals. Suppose D is a domain in the $x, y$-plane whose boundary consists of
one or several nice closed curves, curves consisting of arcs with continuously
turning tangent vectors. The counterclockwise sense of rotation in the plane
imparts a direction to each of the boundary curves (Fig. 6.6). We think of
walking around the boundary curves so that D is always on our left.
We use the symbol $\partial D$ to denote the whole boundary of D, with this direction
imposed. Then we know the meaning of a line integral
\[ \oint_{\partial D} P\,dx + Q\,dy. \]
The following theorem says that this line integral over $\partial D$ is equal to a certain
double integral over D.
Green's Theorem Suppose $P(x, y)$ and $Q(x, y)$ are continuously
differentiable functions on the domain D. Then
\[ \oint_{\partial D} P\,dx + Q\,dy = \iint_D \Bigl(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\Bigr)dx\,dy. \]
We shall not give a complete proof of the theorem but only prove one special
case. The idea in general is to treat the theorem as two separate formulas,
$\oint_{\partial D} P\,dx = -\iint_D P_y\,dx\,dy$ and $\oint_{\partial D} Q\,dy = \iint_D Q_x\,dx\,dy$. For the first, cut the domain by
vertical lines into pieces, each of which is the region between the graphs of
two functions. (This is the case we shall prove in a moment.) Then add up the
results. The double integrals add up to the double integral over the whole
domain. The line integrals add up to the line integral over the boundary,
because the contributions from the vertical division lines cancel in pairs.
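We shall prove
\[ \oint_{\partial D} P\,dx = -\iint_D \frac{\partial P}{\partial y}\,dx\,dy \]
for the domain D of Fig. 6.7, the region between the graphs $y = g(x)$ and $y = h(x)$ for $a \le x \le b$.
The boundary $\partial D$ consists of four pieces, and accordingly the line integral
decomposes into four summands. On the two vertical sides $x$ is constant, hence
$dx = 0$, no contribution. Therefore
\[ \oint_{\partial D} P\,dx = \int_b^a P[x, h(x)]\,dx + \int_a^b P[x, g(x)]\,dx = \int_a^b \bigl\{-P[x, h(x)] + P[x, g(x)]\bigr\}\,dx \]
\[ = -\int_a^b \Bigl(\int_{g(x)}^{h(x)} \frac{\partial P}{\partial y}\,dy\Bigr)dx = -\iint_D \frac{\partial P}{\partial y}\,dx\,dy. \]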
This completes the proof.
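The special case just proved can be checked on a concrete region. The sketch below is illustrative only: the integrand $P = xy^2$, the boundary curves $y = x^2$ and $y = x + 2$ on $0 \le x \le 1$, and the helper `midpoint` are all choices made here, not taken from the text. It compares the boundary integral of $P\,dx$ (bottom curve left to right, top curve right to left) with $-\iint_D P_y\,dx\,dy$ computed as an iterated integral.

```python
def midpoint(f, a, b, n=2000):
    """Midpoint-rule estimate of the integral of f over [a, b]."""
    step = (b - a) / n
    return step * sum(f(a + (i + 0.5) * step) for i in range(n))

def P(x, y):
    return x * y * y

def P_y(x, y):
    return 2 * x * y

def lower(x):      # bottom boundary curve y = g(x), a hypothetical choice
    return x * x

def upper(x):      # top boundary curve y = h(x), a hypothetical choice
    return x + 2

a, b = 0.0, 1.0

# Counterclockwise boundary: along the bottom from a to b, back along the top
# from b to a; the vertical sides contribute nothing because dx = 0 there.
line_integral = midpoint(lambda x: P(x, lower(x)) - P(x, upper(x)), a, b)

# Iterated double integral of -P_y over lower(x) <= y <= upper(x).
double_integral = -midpoint(
    lambda x: midpoint(lambda y: P_y(x, y), lower(x), upper(x), 400), a, b, 400)

print(line_integral, double_integral)
```

For this choice both sides work out by hand to $-41/12$, which gives an independent check on the numbers printed.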
Area Formula If D is a plane domain with a smooth boundary, then
\[ |D| = \frac{1}{2}\oint_{\partial D} -y\,dx + x\,dy. \]
Proof: Apply Green's Theorem with $P = -y$ and $Q = x$. Then
$Q_x - P_y = 2$, so
\[ \oint_{\partial D} -y\,dx + x\,dy = \iint_D 2\,dx\,dy = 2\,|D|. \]
Remark: By similar applications of Green's Theorem, we also have
\[ |D| = \oint_{\partial D} x\,dy = -\oint_{\partial D} y\,dx. \]
The boxed formula is often more convenient than either of these because it
has a certain amount of symmetry.
EXAMPLE 6.5
Find the area enclosed by the ellipse
\[ \frac{x^2}{a^2} + \frac{y^2}{b^2} = 1. \]
Solution: Let D denote the domain bounded by the ellipse. Parameterize the ellipse as usual by
\[ x = a\cos\theta, \quad y = b\sin\theta, \qquad 0 \le \theta \le 2\pi. \]
Then
\[ |D| = \frac{1}{2}\oint_{\partial D} -y\,dx + x\,dy = \frac{1}{2}\int_0^{2\pi} \bigl[-(b\sin\theta)(-a\sin\theta\,d\theta) + (a\cos\theta)(b\cos\theta\,d\theta)\bigr] = \frac{1}{2}\int_0^{2\pi} (ab\sin^2\theta + ab\cos^2\theta)\,d\theta = \pi ab. \]
Answer: $\pi ab$.
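The area formula $|D| = \tfrac{1}{2}\oint_{\partial D} -y\,dx + x\,dy$ is easy to try numerically. A minimal sketch, assuming the boundary is approximated by a polygon with many vertices (the function name `enclosed_area`, the vertex count, and the semi-axes 3 and 2 are choices made here): on a straight edge from $(x_0, y_0)$ to $(x_1, y_1)$ the line integral of $-y\,dx + x\,dy$ equals $x_0 y_1 - x_1 y_0$ exactly, so the whole integral reduces to a sum over edges.

```python
import math

def enclosed_area(points):
    """Half the closed line integral of (-y dx + x dy) along the polygon through points."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]
        # Exact value of the line integral of (-y dx + x dy) over this straight edge.
        total += x0 * y1 - x1 * y0
    return 0.5 * total

# Ellipse with sample semi-axes a = 3, b = 2, traversed counterclockwise.
a, b, N = 3.0, 2.0, 10000
ellipse = [(a * math.cos(2 * math.pi * k / N), b * math.sin(2 * math.pi * k / N))
           for k in range(N)]
print(enclosed_area(ellipse), math.pi * a * b)
```

Reversing the order of the vertices (a clockwise traversal) changes the sign of the result, which matches the orientation convention for $\partial D$.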
EXERCISES
Evaluate $F(t)$ for $t > 0$ and then compute $F'(t)$. Now compute $F'(t)$ by the Leibniz
Rule and compare your answers:
I . F(t) = J e~tx dx 2. F(t) = J ^arctan^dz, t > -
3. F(t) = J (t + x )ndx 4. F(t) = f x* dx, t > 1.
5*. Prove under suitable hypotheses:
\[ \frac{d}{dt}\int_{g(t)}^{h(t)} f(x, t)\,dx = h'(t)\,f[h(t), t] - g'(t)\,f[g(t), t] + \int_{g(t)}^{h(t)} f_t(x, t)\,dx. \]
[Hint: Use the Chain Rule.]
Parametrize and set up an integral for surface area. Evaluate when you can:
6. sphere of radius a
7. lateral surface of a right circular cylinder of radius a and height h
8. lateral surface of a right circular cone of radius a and lateral height L
9. right circular torus obtained by revolving a circle of radius a about an axis in
its plane at distance A from its center (where a < A )
10. ellipsoid $\dfrac{x^2}{a^2} + \dfrac{y^2}{b^2} + \dfrac{z^2}{c^2} = 1$
[Hint: Use spherical coordinates and $\mathbf{x} = (a\sin\phi\cos\theta,\ b\sin\phi\sin\theta,\ c\cos\phi)$.]
11. (cont.) Reduce the double integral to a simple integral in the special case $a = b$.
Set up the area for the given non-parametric surface. Evaluate when you can:
12. $z = xy$; $-1 \le x \le 1$, $-1 \le y \le 1$ (do not evaluate)
13. $z = ax + by$; $(x, y)$ in a domain D
14. $z = x^2 + y^2$; $x^2 + y^2 \le 1$
15. $z = \sqrt{1 - x^2 - y^2}$; $x^2 + y^2 \le 1$.
16. Find the area of the triangle with vertices $(a, 0, 0)$, $(0, b, 0)$, and $(0, 0, c)$.
17. Prove that $\iint \mathbf{N}\,dA = 0$ for a closed surface.
[Hint: Show that $\mathbf{i} \cdot \iint \mathbf{N}\,dA = 0$, etc.]
18. From each point of the space curve $\mathbf{x} = \mathbf{x}(s)$ draw a segment of length 1 in the
direction of the unit tangent. These segments sweep out a surface. Show that its
area is $\tfrac{1}{2}\int k(s)\,ds$, where $k(s)$ is the curvature and the integral is taken over the
length of the curve.
19. (cont.) Interpret for a circle. Can you show that $\int k(s)\,ds = 2\pi$ for any closed
oval (convex curve) in the plane? If so, interpret Ex. 18 for ovals.
20. Let $\mathbf{x} = \mathbf{x}(s)$ be a curve of length L on the unit sphere $|\mathbf{x}| = 1$. Connect each point
of the curve to the origin, forming a surface. Show that the area of this surface
is $\tfrac{1}{2}L$.
Let P and Q be continuously differentiable on a rectangle I.
21. Suppose $(P, Q) = \operatorname{grad}\phi$. Prove
\[ \oint P\,dx + Q\,dy = 0 \]
on every closed curve in I.
22*. (cont.) Prove the converse.
23. Evaluate J s y dx 4x dy over the boundary of the rectangle with vertices at
(0, 1), (6, 1), (6, 3), (0, 3) taken counterclockwise.
24. Evaluate $\oint y^2 e^x\,dx + 2y e^x\,dy$ over the circle $x^2 + y^2 = 1$.
25. Prove Green's formula under suitable hypotheses:
\[ \oint_{\partial D} (-u v_y\,dx + u v_x\,dy) - \oint_{\partial D} (-v u_y\,dx + v u_x\,dy) = \iint_D \bigl[u(v_{xx} + v_{yy}) - v(u_{xx} + u_{yy})\bigr]\,dx\,dy. \]
26. (cont.) Test this formula when D is the unit disk, $u = 1$, and $v = \ln r$. Explain
the result.
27*. Suppose P and Q are continuously differentiable functions on a rectangle I, and
dP/dy = dQ/dx. Prove there is a function f(x, y) on I such that df/dx = P and
df/dy = Q.
28. (cont.) Let $\mathbf{a}$ and $\mathbf{b}$ be points of I. Prove that the line integral $\int_{\mathbf{a}}^{\mathbf{b}} P\,dx + Q\,dy$
is the same for all paths in I from $\mathbf{a}$ to $\mathbf{b}$.
7. IMPROPER INTEGRALS [optional]
A multiple integral is improper if either the domain of integration is un
bounded or the integrand has singularities or both. There are always two
questions: Does the integral converge? If so, what is its value?
The Whole Plane
First we define
\[ \iint_{\mathbf{R}^2} f(x, y)\,dx\,dy. \]
Once we have this, it is easy to define the integral over any unbounded domain
D. The integrand will be a real-valued function f ( x, y) defined on the whole
plane $\mathbf{R}^2$. We can't expect to define the integral for all functions, so we shall
restrict attention to functions having the following reasonable property:
A function $f$ on $\mathbf{R}^2$ is locally integrable if $f$ is integrable on each bounded
R-measurable set D.
We shall define the integral by means of limits of the type
\[ \lim_{n \to \infty} \iint_{D_n} f\,dx\,dy, \]
where $\{D_n\}$ is an expanding sequence of bounded R-measurable sets, $D_1 \subseteq D_2 \subseteq D_3 \subseteq \cdots$, that in some sense fill out the plane.
Approximating Sequence A sequence $\{D_n\}$ of bounded R-measurable
subsets of $\mathbf{R}^2$ is an approximating sequence provided
(1) $D_1 \subseteq D_2 \subseteq D_3 \subseteq \cdots$
(2) if $B > 0$, then there exists an $n$ such that the disk of radius $B$ and
center 0 lies in $D_n$, that is,
\[ \{\mathbf{x} \mid |\mathbf{x}| \le B\} \subseteq D_n. \]
Thus each disk centered at 0, no matter how large, is eventually covered by
the sets of any approximating sequence. In fact, any bounded set is eventually
covered, because a bounded set is contained in some disk centered at 0.
Now we have enough preliminary definitions to define the integral.
Integral on $\mathbf{R}^2$ Let $f$ be a locally integrable function on $\mathbf{R}^2$. Then $f$ is called
integrable on $\mathbf{R}^2$ provided:
(1) For each approximating sequence $\{D_n\}$, the limit
\[ \lim_{n \to \infty} \iint_{D_n} f\,dx\,dy \]
exists.
(2) These limits all have the same value, which is then called
\[ \iint_{\mathbf{R}^2} f\,dx\,dy. \]
In every other case, we say
\[ \iint_{\mathbf{R}^2} f\,dx\,dy \]
diverges. One common case of divergence occurs when
\[ \lim_{n \to \infty} \iint_{D_n} f\,dx\,dy = +\infty \]
for each approximating sequence $\{D_n\}$. Then we write
\[ \iint_{\mathbf{R}^2} f\,dx\,dy = +\infty. \]
The preceding definition has the drawback that it is impossible to test
every approximating sequence. However, when the integrand has constant
sign, it is enough to test any one approximating sequence.
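Theorem 7.1 If $f \ge 0$ on $\mathbf{R}^2$ and
\[ \lim_{n \to \infty} \iint_{D_n} f\,dx\,dy \]
exists (finite or $+\infty$) for one single approximating sequence $\{D_n\}$, then $f$
is integrable on $\mathbf{R}^2$.
Proof: Take any other approximating sequence $\{E_m\}$. For each $m$,
the set $E_m$ is bounded. Hence there exists an $n$ such that $E_m \subseteq D_n$. Since
$f \ge 0$, we conclude
\[ \iint_{E_m} f\,dx\,dy \le \iint_{D_n} f\,dx\,dy \le \lim_{n \to \infty} \iint_{D_n} f\,dx\,dy. \]
From this, we conclude that the increasing sequence $\iint_{E_m} f\,dx\,dy$
converges, possibly to $+\infty$, and
\[ \lim_{m \to \infty} \iint_{E_m} f\,dx\,dy \le \lim_{n \to \infty} \iint_{D_n} f\,dx\,dy. \]
Now we can apply the same argument with the D's and E's interchanged.
The result is
\[ \lim_{n \to \infty} \iint_{D_n} f\,dx\,dy \le \lim_{m \to \infty} \iint_{E_m} f\,dx\,dy, \]
so equality follows. Since $\{E_m\}$ is an arbitrary approximating sequence, the
proof is complete.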
There is a related result about absolute convergence. We shall state it, but
save the (somewhat technical) proof for the exercises.
Theorem 7.2 Suppose
(1) $f$ is locally integrable.
(2) $|f|$ is integrable on $\mathbf{R}^2$.
Then $f$ is integrable on $\mathbf{R}^2$.
The point is that the test of Theorem 7.1 can be applied to $|f|$.
EXAMPLE 7.1
Prove that the integral
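\[ \iint_{\mathbf{R}^2} e^{-x^2 - y^2}\,dx\,dy \]
is convergent and evaluate it.
Solution: The integrand is positive so we can use any convenient approximating sequence. We choose disks,
\[ D_n = \{\mathbf{x} \mid |\mathbf{x}| \le n\}, \]
and switch to polar coordinates:
\[ \iint_{D_n} e^{-x^2 - y^2}\,dx\,dy = \iint_{D_n} e^{-r^2}\,r\,dr\,d\theta = \Bigl(\int_0^{2\pi} d\theta\Bigr)\Bigl(\int_0^n r e^{-r^2}\,dr\Bigr) = (2\pi)\,\tfrac{1}{2}\bigl(1 - e^{-n^2}\bigr) \to \pi. \]
Answer: $\pi$.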
This result has a surprising application.
EXAMPLE 7.2
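Evaluate
\[ \int_{-\infty}^{\infty} e^{-x^2}\,dx. \]
Solution: We solve Example 7.1 again, this time using a sequence of
squares:
\[ E_n = \{(x, y) \mid |x| \le n,\ |y| \le n\}. \]
Now
\[ \iint_{E_n} e^{-x^2 - y^2}\,dx\,dy = \Bigl(\int_{-n}^{n} e^{-x^2}\,dx\Bigr)\Bigl(\int_{-n}^{n} e^{-y^2}\,dy\Bigr) = \Bigl(\int_{-n}^{n} e^{-x^2}\,dx\Bigr)^2. \]
We know from Theorem 7.1 and Example 7.1 that
\[ \iint_{E_n} e^{-x^2 - y^2}\,dx\,dy \to \pi. \]
We conclude
\[ \Bigl(\int_{-\infty}^{\infty} e^{-x^2}\,dx\Bigr)^2 = \pi. \]
Answer: $\sqrt{\pi}$.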
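The two approximating sequences, disks and squares, can be compared numerically. The sketch below is an illustration under stated assumptions: `simpson`, `square_integral`, and `disk_integral` are names invented here, and Simpson's Rule is used only as a convenient one-variable quadrature. Since the disk of radius $n$ lies inside the square $E_n$, which lies inside the disk of radius $n\sqrt{2}$, and the integrand is positive, the square integral should be trapped between $\pi(1 - e^{-n^2})$ and $\pi(1 - e^{-2n^2})$.

```python
import math

def simpson(f, a, b, m):
    """Simpson's Rule with 2m equal subintervals."""
    h = (b - a) / (2 * m)
    total = f(a) + f(b)
    for i in range(1, 2 * m):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def square_integral(n, m=1000):
    """Integral of exp(-x^2 - y^2) over the square |x| <= n, |y| <= n."""
    one_dim = simpson(lambda x: math.exp(-x * x), -n, n, m)
    return one_dim ** 2

def disk_integral(radius):
    """Integral of exp(-x^2 - y^2) over the disk of the given radius (polar coordinates)."""
    return math.pi * (1 - math.exp(-radius ** 2))

for n in (1, 2, 3, 6):
    print(n, disk_integral(n), square_integral(n), disk_integral(n * math.sqrt(2)))
```

The printed rows show both sequences closing in on $\pi$, with the square values sitting between the two disk values for the smaller $n$.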
Remark: In probability theory, the function
\[ \phi(x) = \frac{1}{\sqrt{2\pi}}\,e^{-x^2/2} \]
is important; it is the density function of the normal distribution. The simple
change of variable $x = u/\sqrt{2}$ in Example 7.2 yields
\[ \int_{-\infty}^{\infty} \phi(x)\,dx = 1. \]
The graph of $\phi$ is the familiar bell-shaped curve (Fig. 7.1); the area under the
curve is 1.
Fig. 7.1 normal distribution, $y = \phi(x) = \dfrac{1}{\sqrt{2\pi}}\,e^{-x^2/2}$
Ar bi t r ar y Unbounded Domains
Now let D be an arbitrary unbounded subset of R2. We wish to define
\[ \iint_D f(x, y)\,dx\,dy. \]
To do so, we extend $f$ to the whole plane by making it 0 outside of D. Thus
we define
\[ F(x, y) = \begin{cases} f(x, y), & (x, y) \in D \\ 0, & (x, y) \notin D. \end{cases} \]
Then we set
\[ \iint_D f(x, y)\,dx\,dy = \iint_{\mathbf{R}^2} F(x, y)\,dx\,dy \]
if the right-hand side exists. For the domains that occur in practice, things go routinely from this point.
Remark: Actually, it is reasonable to restrict attention to domains D for which the characteristic function $k_D$ (one on D, zero off) is locally integrable. This means that the intersection of D with any bounded R-measurable set is itself a (bounded) R-measurable set.
Before reading the next example, it is advisable to review Theorem 8.2, p. 110.
EXAMPLE 7.3
Evaluate
\[ \int_0^\infty \frac{e^{-ax} - e^{-bx}}{x}\,dx, \qquad 0 < a < b. \]
Solution: The starting point is the observation
\[ \frac{e^{-ax} - e^{-bx}}{x} = \int_a^b e^{-xy}\,dy. \]
Therefore it is natural to consider
\[ I = \iint_D e^{-xy}\,dx\,dy, \qquad D = \{(x, y) \mid x \ge 0,\ a \le y \le b\}. \]
To evaluate $I$, take the rectangle
\[ D_n = \{(x, y) \mid 0 \le x \le n,\ a \le y \le b\}. \]
On the one hand,
\[ \iint_{D_n} e^{-xy}\,dx\,dy = \int_0^n \Bigl(\int_a^b e^{-xy}\,dy\Bigr)dx = \int_0^n \frac{e^{-ax} - e^{-bx}}{x}\,dx \to \int_0^\infty \frac{e^{-ax} - e^{-bx}}{x}\,dx, \]
so
\[ I = \iint_D e^{-xy}\,dx\,dy = \int_0^\infty \frac{e^{-ax} - e^{-bx}}{x}\,dx. \]
On the other hand,
\[ \iint_{D_n} e^{-xy}\,dx\,dy = \int_a^b \Bigl(\int_0^n e^{-xy}\,dx\Bigr)dy = \int_a^b \frac{1 - e^{-ny}}{y}\,dy. \]
We infer from this that
\[ \int_0^\infty \frac{e^{-ax} - e^{-bx}}{x}\,dx = \lim_{n \to \infty} \int_a^b \frac{1 - e^{-ny}}{y}\,dy. \]
The trick now is to take the limit under the integral sign on the right. That is where Theorem 8.2, Chapter 3 comes in. We observe that
\[ \frac{1 - e^{-ny}}{y} \to \frac{1}{y} \]
uniformly on $[a, b]$ because
\[ \Bigl|\frac{1 - e^{-ny}}{y} - \frac{1}{y}\Bigr| = \frac{e^{-ny}}{y} \le \frac{e^{-na}}{a} \to 0 \quad\text{as } n \to \infty. \]
Therefore by Theorem 8.2,
\[ \int_0^\infty \frac{e^{-ax} - e^{-bx}}{x}\,dx = \int_a^b \lim_{n \to \infty} \frac{1 - e^{-ny}}{y}\,dy = \int_a^b \frac{dy}{y} = \ln\frac{b}{a}. \]
Answer: $\ln(b/a)$.
Singularities
Suppose there is a single point in the (bounded) domain of $f$ at which $f$ becomes infinite, or is undefined in some other way. We try to squeeze down on this isolated singularity of $f$ by a growing sequence of subdomains, in the same way that we expanded to infinity in the case of the whole plane. If we limit ourselves to functions of constant sign, a result similar to Theorem 7.1 says that one such sequence suffices. The following example should illustrate the idea.
Recall the integral
\[ \int_0^1 \frac{dx}{x^p}, \]
which is convergent for $p < 1$ and divergent for $p \ge 1$.
EXAMPLE 7.4
Find all $p$ for which
\[ \iint_{r \le a} \frac{dx\,dy}{r^p} \]
is convergent. ($r^2 = x^2 + y^2$)
Solution: The integrand is positive, except for a singularity at $r = 0$ when $p > 0$. In polar coordinates,
\[ \iint_{\varepsilon \le r \le a} \frac{dx\,dy}{r^p} = \int_0^{2\pi} d\theta \int_\varepsilon^a \frac{r\,dr}{r^p} = 2\pi \int_\varepsilon^a \frac{dr}{r^{p-1}}. \]
Answer: For $p < 2$, it converges; for $p \ge 2$ it diverges.
Iteration
The main point about iteration of improper integrals is that it often doesn't work. Be careful; there are few useful results, and generally each case must be treated on its own merits. It is easy to get incorrect results.
EXAMPLE 7.5
Apply iteration blindly, both ways, to
\[ I = \iint_{\substack{0 \le x \le 1 \\ 0 \le y \le 1}} \frac{x^2 - y^2}{(x^2 + y^2)^2}\,dx\,dy. \]
Solution: The formula
\[ \int \frac{x^2 - a^2}{(x^2 + a^2)^2}\,dx = \frac{-x}{x^2 + a^2} \]
is valid, as is easily seen by differentiating the right-hand side. Therefore
\[ I = \int_0^1 \Bigl(\int_0^1 \frac{x^2 - y^2}{(x^2 + y^2)^2}\,dx\Bigr)dy = \int_0^1 \Bigl(\frac{-x}{x^2 + y^2}\Bigr)\Bigr|_{x=0}^{x=1} dy = -\int_0^1 \frac{dy}{1 + y^2} = -\tfrac{1}{4}\pi. \]
Now look carefully at the integrand. If $x$ and $y$ are interchanged, it changes sign. Therefore integration first on $y$, then on $x$, leads to $I = +\tfrac{1}{4}\pi$, an impossible situation. What is wrong is that the improper double integral does not exist. This example is worth careful study.
To end on a happier note, we give an example where the result is correct, but it is beyond our technique to justify the formal steps. (We shall use integral 41 inside the back cover of the book.) Set
\[ I = \iint_{\substack{0 \le x \\ 0 \le y}} e^{-xy}\sin x\,dx\,dy. \]
First,
\[ I = \int_0^\infty \Bigl(\int_0^\infty e^{-xy}\sin x\,dy\Bigr)dx = \int_0^\infty \sin x\,\Bigl(\frac{-e^{-xy}}{x}\Bigr)\Bigr|_{y=0}^{y=\infty} dx = \int_0^\infty \frac{\sin x}{x}\,dx. \]
Second,
\[ I = \int_0^\infty \Bigl(\int_0^\infty e^{-xy}\sin x\,dx\Bigr)dy = \int_0^\infty \Bigl[\frac{-e^{-xy}}{1 + y^2}\,(y\sin x + \cos x)\Bigr]_{x=0}^{x=\infty} dy = \int_0^\infty \frac{dy}{1 + y^2} = \frac{\pi}{2}. \]
Therefore
\[ \int_0^\infty \frac{\sin x}{x}\,dx = \frac{\pi}{2}. \]
This result is correct, but this derivation is full of holes.
EXERCISES
Prove for integrals over $\mathbf{R}^2$:
1. $\displaystyle\iint \frac{dx\,dy}{1 + x^2 + y^2} = \infty$
2. $\displaystyle\iint \frac{dx\,dy}{1 + x^2 + y^2 + x^2y^2} = \pi^2$.
3. Suppose $0 \le g(x, y) \le f(x, y)$ on $\mathbf{R}^2$ and $\iint f < \infty$. Prove that $\iint g < \infty$.
4*. Prove Theorem 7.2. [Hint: Use a Cauchy Condition.]
5*.
Prove divergent: j j _dx dy 6. Prove convergent: f i r + x V ' dx dy R- - + X4 + y*'
Prove convergent and evaluate:
7. JJ (In r) dxdy 8. JJ (1 - r) dx dy, p > - 1 r<> 1 r<a
9. j j j p8dx dy dz, s > 3 10. j j j (In p) dx dy dz P< 1 ///<- p<i p^i
11. JJJ (1 p)* dx dy dz, s > 1. p<l
12. Prove convergent for $p > 1$ and evaluate:
\[ \iint_{r \ge 1} \frac{dx\,dy}{r^{2p}}. \]
13*. (cont.) Prove
\[ \sum \frac{1}{(m^2 + n^2)^p} < \infty \]
for $p > 1$. The sum is taken over all integers $m$, $n$ except $(0, 0)$.
Compute:
14. j er txl dix, t > 0 15*. J xLe~x2dx.
8. NUMERICAL INTEGRATION [optional]
In this section we discuss one method for approximating double integrals. It is an extension of Simpson's Rule.
Let us recall Simpson's Rule. To approximate an integral
\[ \int_a^b f(x)\,dx, \]
we divide the interval $a \le x \le b$ into $2m$ equal parts of length $h$:
\[ a = x_0 < x_1 < x_2 < \cdots < x_{2m} = b, \qquad h = \frac{b - a}{2m}, \]
and use the formula
\[ \int_a^b f(x)\,dx \approx \frac{h}{3} \sum_{i=0}^{2m} B_i f(x_i), \]
where the coefficients are 1, 4, 2, 4, 2, 4, 2, $\cdots$, 2, 4, 1.
We extend Simpson's Rule to double integrals in the following way. To approximate
\[ \iint_I f(x, y)\,dx\,dy, \]
where I denotes the rectangle $a \le x \le b$ and $c \le y \le d$, we divide the $x$-interval into $2m$ parts as before and also divide the $y$-interval into $2n$ equal parts of length $k$:
\[ c = y_0 < y_1 < y_2 < \cdots < y_{2n} = d, \qquad k = \frac{d - c}{2n}. \]
We obtain $(2m + 1)(2n + 1)$ points of the rectangle I, shown in Fig. 8.1. The Rule is
\[ \iint_I f(x, y)\,dx\,dy \approx \frac{hk}{9} \sum_{i=0}^{2m} \sum_{j=0}^{2n} A_{ij} f(x_i, y_j), \]
where the coefficients $A_{ij}$ are certain products of the coefficients in the ordinary Simpson's Rule. Precisely, $A_{ij} = B_i C_j$,
Fig. 8.2 products of ordinary Simpson's Rule coefficients
where $B_0, B_1, \cdots, B_{2m}$ are the coefficients in the ordinary Simpson's Rule
\[ \int_a^b p(x)\,dx \approx \frac{h}{3} \sum_{i=0}^{2m} B_i p(x_i), \]
and $C_0, C_1, \cdots, C_{2n}$ are the coefficients in the ordinary Simpson's Rule
\[ \int_c^d q(y)\,dy \approx \frac{k}{3} \sum_{j=0}^{2n} C_j q(y_j). \]
In Fig. 8.2, several of these products are formed.
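The double-integral rule with coefficients $A_{ij} = B_i C_j$ and factor $hk/9$ translates directly into a few lines of code. The following is a minimal sketch, not a library routine: the function names `simpson_coefficients` and `simpson2d` and the test integrand $x^3y + y^2$ on $0 \le x \le 2$, $0 \le y \le 1$ are choices made here for illustration. Its exact integral is $4 \cdot \tfrac{1}{2} + 2 \cdot \tfrac{1}{3} = \tfrac{8}{3}$, and since each factor is a polynomial of degree at most 3 in each variable the rule should reproduce it up to rounding.

```python
import math

def simpson_coefficients(parts):
    """Coefficients 1, 4, 2, 4, ..., 2, 4, 1 for an even number of parts."""
    return [1 if i in (0, parts) else (4 if i % 2 else 2) for i in range(parts + 1)]

def simpson2d(f, a, b, c, d, m, n):
    """Simpson's Rule on a <= x <= b, c <= y <= d with 2m by 2n subdivisions."""
    h = (b - a) / (2 * m)
    k = (d - c) / (2 * n)
    B = simpson_coefficients(2 * m)
    C = simpson_coefficients(2 * n)
    total = 0.0
    for i, Bi in enumerate(B):
        for j, Cj in enumerate(C):
            # A_ij = B_i * C_j is the weight attached to the point (x_i, y_j).
            total += Bi * Cj * f(a + i * h, c + j * k)
    return h * k / 9 * total

print(simpson_coefficients(6))
print(simpson2d(lambda x, y: x ** 3 * y + y ** 2, 0.0, 2.0, 0.0, 1.0, 1, 1), 8 / 3)
print(simpson2d(lambda x, y: math.sin(x * y), 0.0, math.pi / 2, 0.0, math.pi / 2, 2, 2))
```

Building the weights as an outer product of two one-variable coefficient lists keeps the code close to the statement $A_{ij} = B_i C_j$; a production routine would more likely vectorize the double loop.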
Since $B_i$ and $C_j$ take values 1, 2, and 4, the coefficients $A_{ij}$ take values 1, 2, 4, 8, and 16. The $A_{ij}$ can be written in a matrix corresponding to the points $(x_i, y_j)$ as in Fig. 8.2. For example, if $m = 3$ and $n = 2$, the matrix is
\[ \begin{bmatrix} 1 & 4 & 2 & 4 & 2 & 4 & 1 \\ 4 & 16 & 8 & 16 & 8 & 16 & 4 \\ 2 & 8 & 4 & 8 & 4 & 8 & 2 \\ 4 & 16 & 8 & 16 & 8 & 16 & 4 \\ 1 & 4 & 2 & 4 & 2 & 4 & 1 \end{bmatrix}. \]
EXAMPLE 8.1
Estimate
\[ \iint_{0 \le x,\, y \le 1} (x + y)^3\,dx\,dy \]
by Simpson's Rule with $m = n = 1$. Compare the result with the exact answer.
Solution: Here $h = k = \tfrac{1}{2}$. The coefficient matrix is
\[ [A_{ij}] = \begin{bmatrix} 1 & 4 & 1 \\ 4 & 16 & 4 \\ 1 & 4 & 1 \end{bmatrix}. \]
Write the values $(x_i + y_j)^3$ in a matrix:
\[ [(x_i + y_j)^3] = \begin{bmatrix} (0 + 1)^3 & (\tfrac{1}{2} + 1)^3 & (1 + 1)^3 \\ (0 + \tfrac{1}{2})^3 & (\tfrac{1}{2} + \tfrac{1}{2})^3 & (1 + \tfrac{1}{2})^3 \\ (0 + 0)^3 & (\tfrac{1}{2} + 0)^3 & (1 + 0)^3 \end{bmatrix} = \begin{bmatrix} 1 & \tfrac{27}{8} & 8 \\ \tfrac{1}{8} & 1 & \tfrac{27}{8} \\ 0 & \tfrac{1}{8} & 1 \end{bmatrix}. \]
Now estimate the integral by
\[ \frac{hk}{9} \sum_{i=0}^{2} \sum_{j=0}^{2} A_{ij}\,(x_i + y_j)^3. \]
To evaluate this sum, multiply corresponding terms of the two matrices and add the nine products:
\[ \frac{1}{36}\Bigl[1 \cdot 1 + 4 \cdot \tfrac{27}{8} + 1 \cdot 8 + 4 \cdot \tfrac{1}{8} + 16 \cdot 1 + 4 \cdot \tfrac{27}{8} + 1 \cdot 0 + 4 \cdot \tfrac{1}{8} + 1 \cdot 1\Bigr] = \frac{1}{36}\Bigl[1 + \tfrac{27}{2} + 8 + \tfrac{1}{2} + 16 + \tfrac{27}{2} + \tfrac{1}{2} + 1\Bigr] = \frac{54}{36} = \frac{3}{2}. \]
The exact value is
\[ \iint_{0 \le x,\, y \le 1} (x + y)^3\,dx\,dy = \int_0^1 \Bigl(\int_0^1 (x + y)^3\,dx\Bigr)dy = \frac{1}{4}\int_0^1 \bigl[(y + 1)^4 - y^4\bigr]\,dy = \frac{1}{20}\bigl[(y + 1)^5 - y^5\bigr]_0^1 = \frac{30}{20} = \frac{3}{2}. \]
Answer: Simpson's Rule gives the estimate $\tfrac{3}{2}$, which is exact.
Remark 1: Because Simpson's Rule is exact for cubics, the double integral rule is exact for cubics in two variables. (See exercises below.)
Remark 2: The matrix of values $[f(x_i, y_j)]$ is arranged to conform to the layout of points $(x_i, y_j)$ in the plane (Fig. 8.1).
The next example is an integral that cannot be evaluated exactly, only approximated.
EXAMPLE 8.2
Estimate
\[ \iint_{0 \le x,\, y \le \pi/2} \sin(xy)\,dx\,dy, \]
using $m = n = 1$.
Solution: Here $h = k = \pi/4$, and the coefficient matrix is
\[ [A_{ij}] = \begin{bmatrix} 1 & 4 & 1 \\ 4 & 16 & 4 \\ 1 & 4 & 1 \end{bmatrix}. \]
The matrix of values of $\sin(xy)$ is
\[ [\sin x_i y_j] = \begin{bmatrix} \sin 0 & \sin\tfrac{\pi^2}{8} & \sin\tfrac{\pi^2}{4} \\ \sin 0 & \sin\tfrac{\pi^2}{16} & \sin\tfrac{\pi^2}{8} \\ \sin 0 & \sin 0 & \sin 0 \end{bmatrix}. \]
The estimate is
\[ \iint_{0 \le x,\, y \le \pi/2} \sin(xy)\,dx\,dy \approx \frac{\pi^2}{144}\Bigl(16\sin\frac{\pi^2}{16} + 8\sin\frac{\pi^2}{8} + \sin\frac{\pi^2}{4}\Bigr). \]
Error Estimate
The error estimate for Simpson's Rule in two variables is analogous to that in one variable:
\[ |\text{error}| \le \frac{(b - a)(d - c)}{180}\,\bigl[h^4 M + k^4 N\bigr], \]
where
\[ \Bigl|\frac{\partial^4 f}{\partial x^4}\Bigr| \le M \quad\text{and}\quad \Bigl|\frac{\partial^4 f}{\partial y^4}\Bigr| \le N. \]
We omit the proof.
EXAMPLE 8.3
Estimate the error in Example 8.2.
Solution:
\[ \frac{\partial^4}{\partial x^4}(\sin xy) = y^4 \sin xy, \qquad \frac{\partial^4}{\partial y^4}(\sin xy) = x^4 \sin xy. \]
But $|\sin xy| \le 1$. Hence in the square $0 \le x, y \le \pi/2$,
\[ \Bigl|\frac{\partial^4 f}{\partial x^4}\Bigr| \le \Bigl(\frac{\pi}{2}\Bigr)^4, \qquad \Bigl|\frac{\partial^4 f}{\partial y^4}\Bigr| \le \Bigl(\frac{\pi}{2}\Bigr)^4. \]
Apply the error estimate, with $m = n = 1$, $h = k = \pi/4$, and $M = N = (\pi/2)^4$:
\[ |\text{error}| \le \frac{(\pi/2)^2}{180}\Bigl[\Bigl(\frac{\pi}{4}\Bigr)^4\Bigl(\frac{\pi}{2}\Bigr)^4 + \Bigl(\frac{\pi}{4}\Bigr)^4\Bigl(\frac{\pi}{2}\Bigr)^4\Bigr] = \frac{\pi^{10}}{1474560} \approx 0.064. \]
EXERCISES
1. Use tables to complete Example 8.2.
2. Do Example 8.2 using $m = n = 2$, and compare your estimate to that of Ex. 1. Also estimate the error.
3. Suppose $f(x, y) = p(x)q(y)$. Show that the double integral Simpson's Rule estimate is just the product of the Simpson's Rule estimate for $\int p(x)\,dx$ by that for $\int q(y)\,dy$.
4. (cont.) Conclude that the rule is exact for polynomials involving only $x^3y^3$, $x^3y^2$, $x^2y^3$, $x^3y$, $x^2y^2$, $xy^3$, and lower degree terms.
5. The analogue of the Trapezoidal Rule is
\[ \iint_{0 \le x,\, y \le 1} f(x, y)\,dx\,dy \approx \tfrac{1}{4}\bigl[f(0, 0) + f(0, 1) + f(1, 1) + f(1, 0)\bigr]. \]
Show this rule is exact for polynomials $f(x, y) = A + Bx + Cy + Dxy$.
6. (cont.) Find the corresponding rule for a rectangle $a \le x \le b$, $c \le y \le d$, divided into rectangles of size $h$ by $k$ with $h = (b - a)/m$ and $k = (d - c)/n$.
7. (cont.) Let I denote the unit square $0 \le x, y \le 1$. Suppose $f(x, y) = 0$ at its four vertices. Prove that
\[ \iint_I f(x, y)\,dx\,dy = -\frac{1}{2}\iint_I y(1 - y)\,f_{yy}(x, y)\,dx\,dy - \frac{1}{4}\int_0^1 x(1 - x)\bigl[f_{xx}(x, 1) + f_{xx}(x, 0)\bigr]\,dx. \]
8. (cont.) Suppose also that $|f_{xx}| \le M$ and $|f_{yy}| \le N$ on I. Prove that
\[ \Bigl|\iint_I f(x, y)\,dx\,dy\Bigr| \le \frac{M + N}{12}. \]
9. (cont.) Conclude that for any function $f(x, y)$, the error in the trapezoidal estimate (Ex. 5) is at most $(M + N)/12$. [Hint: Use the result of Ex. 5 and interpolation.]
10. (cont.) Suppose $f(x, y)$ is defined on $0 \le x \le h$, $0 \le y \le k$, and $f = 0$ at the vertices of this rectangle I. Suppose also $|f_{xx}| \le M$ and $|f_{yy}| \le N$.
Deduce that
\[ \Bigl|\iint_I f(x, y)\,dx\,dy\Bigr| \le \frac{hk}{12}\,(h^2 M + k^2 N). \]
11. Extend Simpson's Rule for double integrals to triple integrals.
12. (cont.) For which polynomials in 3 variables is the extended Simpson's Rule exact?
13. (cont.) Estimate the integral of $\sin(xyz)$ over the cube $0 \le x, y, z \le 1$, dividing the cube into 8 equal parts. Carry your work to 3 places.
13. Differential Equations
1. INTRODUCTION
We are familiar with differential equations of the type
\[ \frac{dy}{dx} = f(x). \]
Each solution is an antiderivative of $f(x)$. We are also familiar with the differential equation
\[ \frac{dy}{dx} = y, \]
each solution of which is a function $y = ce^x$. These differential equations are instances of the most general first order differential equation,
\[ \frac{dy}{dx} = q(x, y), \]
where $q(x, y)$ is a function of two variables. First order refers to the presence of the first derivative only, not the second or third, etc.
A differential equation $dy/dx = q(x, y)$ defines a direction field. At each point $(x, y)$ of the domain of $q(x, y)$, imagine a short line segment of slope $q(x, y)$, as in Fig. 1.1. A solution of the differential equation is a function $y = y(x)$ whose graph has slope matching that of the direction field (Fig. 1.2).
The most important problem in this subject is the initial-value problem:
Initial-Value Problem Find the solution $y = y(x)$ of the differential equation
\[ \frac{dy}{dx} = q(x, y) \]
whose graph passes through a given point $(a, b)$, that is, which satisfies the initial condition $y(a) = b$.
It is shown in advanced courses that if $q(x, y)$ is a reasonably behaved function, then the initial-value problem has a unique solution locally. That is, there is a neighborhood of $a$ on which there exists one and only one differentiable function $y(x)$ satisfying $y'(x) = q[x, y(x)]$ and $y(a) = b$. This is true, for instance, if $q(x, y)$ has continuous partial derivatives in a neighborhood of $(a, b)$.
In Section 7, we shall discuss one method of proving such results, successive approximation.

Fig. 1.1 (direction field)    Fig. 1.2 (a solution y = y(x) following the direction field)

Figure 1.3 shows two solutions of
$$\frac{dy}{dx} = q(x, y),$$
one satisfying the initial condition $y(0) = 0$, the other satisfying the initial condition $y(1) = 1$.

Fig. 1.3

Usually we first find the general solution to the differential equation, ignoring the initial condition. Somewhere along the line we are forced to integrate, thus introducing a constant of integration. We then select the constant so that the initial condition is satisfied.

EXAMPLE 1.1
Find a solution of the initial-value problem
$$\frac{dy}{dx} = x^2, \qquad y(-3) = 1.$$
Solution: Integrate:
$$y(x) = \tfrac13 x^3 + c.$$
This is the general solution of the differential equation. Now select the constant by substituting the initial condition:
$$1 = \tfrac13(-3)^3 + c, \qquad c = 10.$$
Answer: $y(x) = \tfrac13 x^3 + 10$.

EXERCISES
Solve the initial-value problem: i dx x 7 . 1 - , , ,(1) 10 9. ^ = a: + sin y(0) = 0 1 1 .$ =
dX y / t f + 1
dy
y(0) = o
2 . ^ = x* + x - 5 , 2/(0) = - 2
A. cos 2x, y( r) = 0
ax
6. ^ = ze**, 2/(0) = 1
8 . f x = y, 2/(1) = 0
dy
13. - g = (1 + x2)(l + x3), 2/(0) = 0.
14. Sketch the direction field of $dy/dx = y^2$. Draw the solutions passing through (0, 1); through (0, −1). Rewrite the equation as $dx/dy = y^{-2}$ and solve.
15. Solve $xy' + y = 2x$.
[Hint: Compute $(xy)'$.]
16. Solve $y' + y = e^{-x}\sin x$.
[Hint: Compute $(e^x y)'$.]
17. Solve $x + yy' = x^2 + y^2$.
[Hint: Compute the derivative of $\ln(x^2 + y^2)$.]
18. Solve $xy' - y = x^2 \sin x$.
[Find the trick yourself this time.]
2. SEPARATION OF VARIABLES
The general first order differential equation is
$$\frac{dy}{dx} = q(x, y).$$
In this section, we study the special case in which q(x, y) = f ( x) g( y) ,
the product of a function of x alone by a function of y alone. Examples are
$$\frac{dy}{dx} = x^2y^3, \qquad \frac{dy}{dx} = e^x \cos y, \qquad \frac{dy}{dx} = \frac{y}{1 + x^2}.$$
Not included in this special case are
dy dy . , K dy /--------------
^ ~ V l + * + *,
since in each of these equations the right-hand side is not of the form
f ( x) g(y) .
Solve
$$\frac{dy}{dx} = f(x)\,g(y)$$
by first separating the variables, putting all the "y" on one side and all the "x" on the other. Write
$$\frac{dy}{g(y)} = f(x)\,dx.$$
Then integrate both sides (if you can):
$$G(y) = F(x) + c,$$
where $G(y)$ is an antiderivative of $1/g(y)$ and $F(x)$ is an antiderivative of $f(x)$. If this equation can be solved for y in terms of x, the resulting function is a solution of the differential equation. (This technique will be justified after several examples.)
Note that in separating the variables you must assume that $g(y) \ne 0$. If $g(y) = 0$ for $y = y_0$, it is simple to check that the constant function $y(x) = y_0$ is a solution of the equation.
EXAMPLE 2.1
Solve $\dfrac{dy}{dx} = xy$.
Solution: Obviously $y(x) = 0$ is a solution. Now assume $y(x) \ne 0$ and separate variables:
$$\frac{dy}{y} = x\,dx, \qquad \int \frac{dy}{y} = \int x\,dx,$$
$$\ln|y| = \tfrac12 x^2 + c,$$
$$|y| = ke^{x^2/2} \qquad (k = e^c),$$
where the constant $k = e^c$ is positive. If $y > 0$, then $|y| = y$; if $y < 0$, then $|y| = -y$. Hence
$$y = \begin{cases} ke^{x^2/2} & \text{if } y > 0 \\ -ke^{x^2/2} & \text{if } y < 0. \end{cases}$$
This is equivalent to
$$y = ae^{x^2/2},$$
where a is any non-zero constant.
Answer: $y = ae^{x^2/2}$, a any constant. (Note that the solution $y(x) = 0$ is included in this answer.)
Check:
$$\frac{dy}{dx} = \frac{d}{dx}\left(ae^{x^2/2}\right) = axe^{x^2/2} = xy.$$
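The check above is symbolic; the same answer can also be compared with a direct numerical integration of the differential equation. The following is only a sketch: the helper name `rk4`, the step count, and the sample values a = −1.5 and x = 2 are illustrative choices, not part of the text.

```python
import math

def rk4(q, x0, y0, x1, steps=200):
    """Integrate dy/dx = q(x, y) from (x0, y0) to x1 with classical Runge-Kutta steps."""
    h = (x1 - x0) / steps
    x, y = x0, y0
    for _ in range(steps):
        k1 = q(x, y)
        k2 = q(x + h / 2, y + h * k1 / 2)
        k3 = q(x + h / 2, y + h * k2 / 2)
        k4 = q(x + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        x += h
    return y

a = -1.5                                   # illustrative initial value y(0) = a
numeric = rk4(lambda x, y: x * y, 0.0, a, 2.0)
exact = a * math.exp(2.0 ** 2 / 2)         # the answer y = a e^(x^2/2) at x = 2
print(numeric, exact)
```

The two printed numbers should agree to several decimal places, whatever sign a has, which is consistent with the answer covering both the cases y > 0 and y < 0.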
EXAMPLE 2.2
Solve $\dfrac{dy}{dx} = \dfrac{2x(y - 1)}{x^2 + 1}$.
Solution: Obviously $y(x) = 1$ is a solution. Now assume $y(x) \ne 1$ and separate variables:
$$\frac{dy}{y - 1} = \frac{2x\,dx}{x^2 + 1}, \qquad \int \frac{dy}{y - 1} = \int \frac{2x\,dx}{x^2 + 1},$$
$$\ln|y - 1| = \ln(x^2 + 1) + c,$$
$$|y - 1| = e^c(x^2 + 1) = k(x^2 + 1).$$
Therefore
$$y - 1 = \pm k(x^2 + 1),$$
where k is a positive constant. The answer may be written
$$y = 1 + a(x^2 + 1),$$
where a is an arbitrary constant. This includes the special solution $y(x) = 1$.
Answer: $y = 1 + a(x^2 + 1)$.
Check:
$$\frac{dy}{dx} = 2ax = 2x\,\frac{a(x^2 + 1)}{x^2 + 1} = \frac{2x(y - 1)}{x^2 + 1}.$$
EXAMPLE 2.3
Solve $\dfrac{dy}{dx} = e^{x - y}$.
Solution:
$$\frac{dy}{dx} = e^x e^{-y}, \qquad e^y\,dy = e^x\,dx, \qquad e^y = e^x + c.$$
Answer: $y = \ln(e^x + c)$, $e^x + c > 0$.
Check:
$$\frac{dy}{dx} = \frac{e^x}{e^x + c} = \frac{e^x}{e^y} = e^{x - y}.$$
These examples show that the method works. Why does it work?
Suppose $y = y(x)$ satisfies
$$\frac{dy}{dx} = f(x)\,g(y).$$
Let $G(y)$ be any antiderivative of $1/g(y)$. Note that $G(y)$ is indirectly a function of x. By the Chain Rule,
$$\frac{dG}{dx} = \frac{dG}{dy}\,\frac{dy}{dx} = \frac{1}{g(y)}\,f(x)\,g(y) = f(x).$$
Hence $G(y)$ is an antiderivative of $f(x)$. Therefore, if $F(x)$ is any antiderivative of $f(x)$,
$$G(y) = F(x) + c.$$
If you can solve this equation for y as a function of x, you have the general solution. If not, then at least you have a relation between x and y, often adequate for applications.
EXAMPLE 2.4
Solve the initial-value problem
$$\frac{dy}{dx} = \frac{2x}{5y^4 - 1},$$
with initial data satisfying $x^2 = 1$ and $y^5 - y = 0$.
Solution:
$$(5y^4 - 1)\,dy = 2x\,dx, \qquad y^5 - y = x^2 + c.$$
We cannot solve this fifth degree equation for y; it is hopeless. Still we can substitute the initial data:
$$0 = 1 + c, \qquad c = -1,$$
$$y^5 - y = x^2 - 1.$$
This relation between x and y is as far as we shall get with the problem. But it is adequate for a graph and can be used to calculate y (given x) to any degree of accuracy.
Answer: $y^5 - y = x^2 - 1$.
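To illustrate how the relation can be used to calculate y for a given x, here is a minimal sketch using Newton's method. It assumes the branch of the curve through the point (1, 1), on which $5y^4 - 1 > 0$; the function name `solve_y`, the starting guess, and the tolerance are hypothetical choices.

```python
def solve_y(x, y_start=1.5, tol=1e-12, max_iter=50):
    """Newton's method for y**5 - y = x**2 - 1, assuming the branch with y >= 1."""
    target = x * x - 1
    y = y_start
    for _ in range(max_iter):
        f = y ** 5 - y - target        # residual of the implicit relation
        df = 5 * y ** 4 - 1            # derivative with respect to y
        step = f / df
        y -= step
        if abs(step) < tol:
            break
    return y

for x in (1.0, 1.5, 2.0):
    y = solve_y(x)
    print(x, y, y ** 5 - y - (x * x - 1))   # last column: residual, near zero
```

On this branch the left side $y^5 - y$ is increasing and convex, so the iteration settles quickly for $x \ge 1$; other branches of the curve would need a different starting guess.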
EXAMPLE 2.5
Solve the initial-value problem $\dfrac{dy}{dx} = x^2y^2$, $y(0) = b$.
Solution: Obviously $y(x) = 0$ is the solution if $b = 0$. Assume $y \ne 0$ and separate variables:
$$\frac{dy}{y^2} = x^2\,dx, \qquad -\frac{1}{y} = \frac{x^3}{3} + c.$$
Substitute the initial data:
$$-\frac{1}{b} = 0 + c = c.$$
Hence
$$-\frac{1}{y} = \frac{x^3}{3} - \frac{1}{b} = \frac{bx^3 - 3}{3b}.$$
Answer: $y = \dfrac{3b}{3 - bx^3}$. (The answer includes the solution $y(x) = 0$.)
EXAMPLE 2.6
Find all curves with the following geometric property: the slope
of the curve at each point P is twice the slope of the line through
P and the origin.
Sol ut i on: Translate the geometrical property into an equation.
Suppose the graph of y{x) is such a curve and P = (x, y) a point on it. The
slope at P is dy/ dx. The slope of the line through P and (0, 0) is y/ x. Hence,
the geometrical property is expressed by the differential equation
$$\frac{dy}{dx} = \frac{2y}{x}.$$
Obviously $y(x) = 0$ is a solution. Now assume $y(x) \ne 0$ and separate variables:
$$\frac{dy}{y} = \frac{2\,dx}{x}, \qquad \ln|y| = 2\ln|x| + c = \ln x^2 + c.$$
Take exponentials, setting $k = e^c$:
$$|y| = kx^2.$$
As in Example 2.1, it follows that $y = ax^2$, where a is any constant.
Answer: The curves are the
parabolas y = ax2 and the line
y = 0. See Fig. 2.1.
Fig. 2.1
EXAMPLE 2.7
Find all curves whose slope at each point P is the reciprocal of the
slope of the line through P and the origin.
Solution: The problem calls for all functions $y(x)$ satisfying
$$\frac{dy}{dx} = \frac{x}{y}.$$
Separate variables:
$$y\,dy = x\,dx, \qquad \tfrac12 y^2 = \tfrac12 x^2 + c.$$
Answer: The curves are the rectangular hyperbolas $y^2 - x^2 = c$. See Fig. 2.2.
Note the special case $c = 0$, $y = \pm x$.
EXERCISES
Solve:
x dy = & 2 =
dx y dx xy
- dy
5. -y- = xe2'
dx
dy = ( 1 + y\ 2
dx yi + x j
(? )-
Show that the substitution y = ux changes the differential equation into an equation
for u = u(x) whose variables separate, and solve:
dy = x2+ 2/2 8 = ^ + V
dx 2x2 dx x y
Solve the initial-value problem:
9- S - - S ' (1>- 3 10.g-2co.,,
- D ( - 2) , ( 4) , 0
cfc a:
12. ^ = y3 sin x, y(0) = ^
Solve the given differential equation; interpret the equation and its solution
geometrically:
13. d J * = - y - 1 4 . ? = - -
dx x dx y
1 5 . ^ - * -
dx x
3. LINEAR DIFFERENTIAL EQUATIONS
A differential equation of the form
$$\frac{dy}{dx} + p(x)y = q(x)$$
is called a first order linear differential equation.
The term linear can be described in the following way. Imagine a
black box or processor that converts a function y( x) into a function
w( x) , as in Fig. 3.1.
Fig. 3.1 (input $y(x)$, output $w(x)$)
The black box is called linear if from a linear combination* of inputs, it produces the same linear combination of outputs (Fig. 3.2):
$$y_1(x) \longrightarrow w_1(x), \qquad y_2(x) \longrightarrow w_2(x), \qquad c_1y_1(x) + c_2y_2(x) \longrightarrow c_1w_1(x) + c_2w_2(x).$$
Fig. 3.2
In particular, the black box that converts y into $dy/dx + p(x)y$ is linear: when the input is $c_1y_1 + c_2y_2$, the output is
$$\frac{d}{dx}(c_1y_1 + c_2y_2) + p(x)(c_1y_1 + c_2y_2) = c_1\left(\frac{dy_1}{dx} + p(x)y_1\right) + c_2\left(\frac{dy_2}{dx} + p(x)y_2\right) = c_1(\text{output}_1) + c_2(\text{output}_2).$$
This is why the differential equation $(dy/dx) + p(x)y = q(x)$ is called linear.
Remark: The black boxes discussed here are nothing else than the linear functions of linear algebra, in an appropriate setting. (See Ch. 7, Sec. 1.) For example, let D be the set of all differentiable functions $y(x)$. Then the black box defined by $Ly = y' + p(x)y$ is a linear function defined on D since
$$L(ay_1 + by_2) = aLy_1 + bLy_2.$$
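Not every black box is linear, however. For instance, consider the black box in Fig. 3.3.
$$y \longrightarrow \left(\frac{dy}{dx}\right)^2$$
Fig. 3.3
This box is not linear since the output of a sum is not the sum of the outputs (Fig. 3.4).
$$y_1 \longrightarrow \left(\frac{dy_1}{dx}\right)^2, \qquad y_2 \longrightarrow \left(\frac{dy_2}{dx}\right)^2, \qquad y_1 + y_2 \longrightarrow \left(\frac{dy_1}{dx} + \frac{dy_2}{dx}\right)^2$$
Fig. 3.4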
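The linearity test can be tried numerically. The sketch below compares the box $y \to y' + p(x)y$ with a box that squares the derivative (taken here as an assumed reading of Fig. 3.3). The coefficient p(x) = x, the inputs sin and exp, the constants, and the finite-difference helper `ddx` are all illustrative assumptions.

```python
import math

def ddx(y, x, h=1e-5):
    """Central-difference approximation to y'(x)."""
    return (y(x + h) - y(x - h)) / (2 * h)

def linear_box(y, p):
    """Output of the box y -> y' + p(x) y, returned as a function of x."""
    return lambda x: ddx(y, x) + p(x) * y(x)

def squared_box(y):
    """Output of the box y -> (y')**2, returned as a function of x."""
    return lambda x: ddx(y, x) ** 2

p = lambda x: x                  # hypothetical coefficient p(x)
y1, y2 = math.sin, math.exp      # two sample inputs
c1, c2 = 2.0, -3.0
combo = lambda x: c1 * y1(x) + c2 * y2(x)

x = 0.7
# Output of the combination minus the same combination of the outputs.
lin_gap = linear_box(combo, p)(x) - (c1 * linear_box(y1, p)(x) + c2 * linear_box(y2, p)(x))
sq_gap = squared_box(combo)(x) - (c1 * squared_box(y1)(x) + c2 * squared_box(y2)(x))
print(lin_gap, sq_gap)
```

For the linear box the gap is zero up to rounding error; for the squaring box it is far from zero, so that box fails the test.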
* An expression $aX + bY + cZ + \cdots$, where $a, b, c, \cdots$ are numbers, is called a linear combination of $X, Y, Z, \cdots$.
Figure 3.5 shows another non-linear process.
$$y_1 \longrightarrow \frac{dy_1}{dx} + xy_1^2, \qquad y_2 \longrightarrow \frac{dy_2}{dx} + xy_2^2, \qquad y_1 + y_2 \longrightarrow \frac{d}{dx}(y_1 + y_2) + x(y_1 + y_2)^2$$
Fig. 3.5

A Property of Linear Processes
An extremely useful property of linear black boxes is shown in Fig. 3.6.
Fig. 3.6
If the output of $y_2$ is 0, then the output of $y_1 + y_2$ is the output of $y_1$. Hence output is unchanged when input is augmented by a function which is converted to zero.
This simple fact is the basis for a systematic study of linear differential equations. We return to the process that really interests us, the one that converts $y(x)$ into $dy/dx + p(x)y$. For brevity, we write
$$\frac{dy}{dx} + p(x)y = Ly,$$
indicating that Ly is the output corresponding to the input y. The differential equation in question is
$$\frac{dy}{dx} + p(x)y = q(x), \quad\text{or}\quad Ly = q.$$
A solution is an input $y(x)$ that results in the given output $q(x)$.
Suppose we can find a solution $y_1$. That means $Ly_1 = q$. If z is any function for which $Lz = 0$, then $L(y_1 + z) = q$. Thus $y_1 + z$ is also a solution. Furthermore, every solution must be of the form $y_1 + z$. To see why, assume $y_1$ and $y_2$ are solutions:
$$Ly_1 = q, \qquad Ly_2 = q.$$
Consider the difference $y_1 - y_2$. Because L is a linear process,
$$L(y_1 - y_2) = Ly_1 - Ly_2 = q - q = 0.$$
Therefore, when the input is $y_1 - y_2$, the output is 0; this says the difference of two solutions is one of the functions z.
Any two solutions of the differential equation
$$\frac{dy}{dx} + p(x)y = q(x)$$
differ by a solution of the homogeneous equation
$$\frac{dy}{dx} + p(x)y = 0.$$
Therefore, the general solution of
$$\frac{dy}{dx} + p(x)y = q(x)$$
is
$$u(x) + z(x),$$
where $u(x)$ is any solution of the equation and $z(x)$ is the general solution of the associated homogeneous equation.
Because of this analysis, the differential equation (often called the non-homogeneous equation)
$$\frac{dy}{dx} + p(x)y = q(x)$$
is solved in three steps:
(1) Find the general solution of the associated homogeneous equation
$$\frac{dy}{dx} + p(x)y = 0.$$
(2) Find any solution of
$$\frac{dy}{dx} + p(x)y = q(x).$$
(3) Add the results.
In the next two sections we shall treat steps (1) and (2) separately.
EXERCISES
Suppose the input to a black box is y(x) and the output is one of the following.
Determine in each case whether the black box is linear or non-linear:
1. J^ y(x) dx 2. y(x) + x
3- S + 6 s + 4' 3w
5. In \y(x)\ 6. y(x + 1).
7. What is the general solution of the differential equation $dy/dx = x^2$? Is your answer a particular solution plus the general solution of the associated homogeneous equation?
8. A big black box consists of two linear black boxes in series. Is the big box linear?
4. HOMOGENEOUS EQUATIONS
Let us solve the homogeneous equation
$$\frac{dy}{dx} + p(x)y = 0.$$
Obviously $y(x) = 0$ is a solution. Assume $y(x) \ne 0$ and separate variables:
$$\frac{dy}{y} = -p(x)\,dx, \qquad \ln|y| = -P(x) + c,$$
where $P(x)$ is any antiderivative of $p(x)$. Now exponentiate both sides, setting $k = e^c$:
$$|y(x)| = ke^{-P(x)}.$$
Hence,
$$y(x) = \begin{cases} ke^{-P(x)} & \text{if } y > 0 \\ -ke^{-P(x)} & \text{if } y < 0. \end{cases}$$
These facts are summarized in the following statement.

The general solution of
$$\frac{dy}{dx} + p(x)y = 0$$
is
$$y(x) = ae^{-P(x)},$$
where $P(x)$ is any antiderivative of $p(x)$ and a is a constant.
EXAMPLE 4.1
Find the general solution of $\dfrac{dy}{dx} + (\sin x)y = 0$.
Solution: The equation is of the form
$$\frac{dy}{dx} + p(x)y = 0,$$
where $p(x) = \sin x$. An antiderivative of $p(x)$ is $P(x) = -\cos x$. By the formula, the general solution is
$$y(x) = ae^{-P(x)} = ae^{\cos x}.$$
Answer: $y(x) = ae^{\cos x}$.
EXAMPLE 4.2
Solve the initial-value problem $y' + 3y = 0$, $y(1) = 4$.
Solution: In this case $p(x) = 3$. An antiderivative is $P(x) = 3x$ and the general solution is
$$y(x) = ae^{-3x}.$$
To choose the correct constant, substitute $x = 1$, $y = 4$:
$$4 = ae^{-3}, \qquad a = 4e^3.$$
Answer: $y(x) = 4e^{3(1 - x)}$.
EXAMPLE 4.3
Solve the initial-value problem $\dfrac{dy}{dx} + \dfrac{1}{x}\,y = 0$, $y(1) = 2$.
Solution: Since $p(x) = 1/x$ is not defined at $x = 0$, the solution may behave badly near $x = 0$. Therefore consider only $x > 0$.
An antiderivative of $1/x$ is $\ln x$; the general solution is
$$y(x) = ae^{-\ln x} = \frac{a}{x}$$
(which $\to \infty$ as $x \to 0$). To choose the correct constant, substitute $x = 1$, $y = 2$:
$$2 = a.$$
Answer: $y(x) = \dfrac{2}{x}$, $x > 0$.
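When $P(x)$ has no convenient closed form, the formula $y = ae^{-P(x)}$ can still be evaluated by numerical integration. The following sketch takes $P(x) = \int_{x_0}^{x} p(t)\,dt$, so that the constant a is simply the initial value $y(x_0)$. The helper names `simpson` and `homogeneous_solution` and the number of subintervals are illustrative assumptions.

```python
import math

def simpson(f, a, b, n=200):
    """Composite Simpson's Rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def homogeneous_solution(p, x0, y0):
    """Return the function y(x) = y0 * exp(-integral of p from x0 to x)."""
    return lambda x: y0 * math.exp(-simpson(p, x0, x))

y_const = homogeneous_solution(lambda x: 3.0, 1.0, 4.0)      # y' + 3y = 0, y(1) = 4
y_recip = homogeneous_solution(lambda x: 1.0 / x, 1.0, 2.0)  # y' + y/x = 0, y(1) = 2

print(y_const(2.0), 4 * math.exp(3 * (1 - 2.0)))   # numeric value vs 4e^(3(1-x))
print(y_recip(3.0), 2 / 3.0)                       # numeric value vs 2/x
```

Each line prints the numerical value next to the closed-form answer found in the corresponding example.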
EXERCISES
Solve:
l . ? + 3^ = 0
ax x
3. + V tan 0 = 0
5. x + (1 + x)y = 0.
Solve the initial-value problem:
6. + xy = 0, 2/(0) = - 1
8. ^ + J/ V x = 0, y(0) = 2
10. Solve y" y' = 0 by substituting = y'.
dx
2. L + Ri = 0, L, R constants
at
1 dy y
*2x dx x2 + 1
= 0
7. (1 - x2) + xy - 0, y(0) = 3
9 - 1 + 1 - 0 .
5. NON-HOMOGENEOUS EQUATIONS
We now consider the non-homogeneous equation
$$\frac{dy}{dx} + p(x)y = q(x).$$
By our analysis of the problem, we need only one particular solution of this equation, to be added to the general solution of the associated homogeneous equation. We shall discuss two methods for finding a particular solution.
METHOD 1. GUESSING.
This works particularly well when
(a) $p(x)$ is a constant, and
(b) $q(x)$ and all of its derivatives can be expressed in terms of a few functions. For example, a quadratic polynomial and its derivatives can be expressed in terms of 1, x, and $x^2$. The function $\sin x$ and its derivatives can be expressed in terms of $\sin x$ and $\cos x$. Each of the functions
$$e^x, \quad xe^x, \quad x^2e^x, \quad x\cos x, \quad e^x\sin x$$
has a similar property; it and all its derivatives involve only a few functions. However, $1/x$ is not of this type; each of its derivatives involves a different function.
EXAMPLE 5.1
Find a solution of $\dfrac{dy}{dx} - y = x^2$.
Solution: The right side of the equation is a quadratic polynomial. Guess a quadratic polynomial
$$y(x) = Ax^2 + Bx + C,$$
and try to determine suitable coefficients A, B, C. Substitute $y(x)$ into the differential equation:
$$(2Ax + B) - (Ax^2 + Bx + C) = x^2,$$
$$-Ax^2 + (2A - B)x + (B - C) = x^2.$$
Equate coefficients:
$$-A = 1, \qquad 2A - B = 0, \qquad B - C = 0,$$
from which
$$A = -1, \qquad B = -2, \qquad C = -2.$$
Answer: $y(x) = -(x^2 + 2x + 2)$ is a solution.
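Equating coefficients for a polynomial right side always leads to a triangular system that can be solved from the highest power down. The sketch below does this for $y' + p_0y = q(x)$ with $p_0$ a non-zero constant; the function name `polynomial_guess` and the coefficient-list convention (constant term first) are assumptions made for illustration.

```python
from fractions import Fraction

def polynomial_guess(p0, q):
    """Solve y' + p0*y = q(x) for a polynomial y, where p0 is a non-zero constant.

    q and the returned list hold coefficients from the constant term upward.
    """
    n = len(q) - 1
    y = [Fraction(0)] * (n + 1)
    for k in range(n, -1, -1):
        # The coefficient of x**k in y' + p0*y is (k + 1)*y[k + 1] + p0*y[k].
        from_derivative = (k + 1) * y[k + 1] if k < n else 0
        y[k] = (Fraction(q[k]) - from_derivative) / p0
    return y

coeffs = polynomial_guess(-1, [0, 0, 1])   # y' - y = x**2
print(coeffs)                              # C, B, A in that order
```

For this input the list holds C, B, A, and can be compared with the values found by hand in Example 5.1.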
EXAMPLE 5.2
Find a solution of $\dfrac{dy}{dx} - y = e^{3x}$.
Solution: Try $y = Ae^{3x}$. The equation becomes
$$3Ae^{3x} - Ae^{3x} = e^{3x}.$$
Hence,
$$2A = 1, \qquad A = \tfrac12.$$
Answer: $y(x) = \tfrac12 e^{3x}$ is a solution.
EXAMPLE 5.3
Find a solution of $\dfrac{dy}{dx} - y = xe^{3x}$.
Solution: Notice that $xe^{3x}$ and all its derivatives involve only the functions $xe^{3x}$ and $e^{3x}$. Try
$$y(x) = Axe^{3x} + Be^{3x}.$$
The differential equation becomes
$$(Ae^{3x} + 3Axe^{3x} + 3Be^{3x}) - (Axe^{3x} + Be^{3x}) = xe^{3x},$$
$$2Axe^{3x} + (A + 2B)e^{3x} = xe^{3x}.$$
Thus, $2A = 1$, $A + 2B = 0$, from which $A = \tfrac12$, $B = -\tfrac14$.
Answer: $y(x) = \tfrac12 xe^{3x} - \tfrac14 e^{3x}$ is a solution.
EXAMPLE 5.4
Find a solution of $\dfrac{dy}{dx} - y = \sin x$.
Solution: All derivatives of $\sin x$ involve only $\sin x$ and $\cos x$. Try
$$y(x) = A\cos x + B\sin x.$$
The differential equation becomes
$$(-A\sin x + B\cos x) - (A\cos x + B\sin x) = \sin x,$$
$$-(A + B)\sin x + (B - A)\cos x = \sin x.$$
Therefore
$$-(A + B) = 1, \qquad B - A = 0.$$
Thus $A = B = -\tfrac12$.
Answer: $y(x) = -\tfrac12\cos x - \tfrac12\sin x$ is a solution.
There is a special situation in which the method must be modified. Consider for instance the differential equation
$$\frac{dy}{dx} - y = e^x.$$
Try $y(x) = Ae^x$:
$$\frac{d}{dx}(Ae^x) - Ae^x = e^x, \qquad Ae^x - Ae^x = e^x, \qquad 0 = e^x,$$
which is impossible. The guess fails because $e^x$ is a solution of the homogeneous equation
$$\frac{dy}{dx} - y = 0.$$
Consequently, substituting $y = Ae^x$ makes the left side zero; there is no hope of equating it to the right side. The function $xe^x$ is a solution, however, as is easily checked. This is an example of a general situation:

If $q(x)$ is a solution of the homogeneous equation
$$\frac{dy}{dx} + p(x)y = 0,$$
then $y(x) = xq(x)$ is a solution of the non-homogeneous equation
$$\frac{dy}{dx} + p(x)y = q(x).$$

This fact soon will seem more natural when we discuss Method 2. (See Example 5.8.) It is easily verified:
$$\frac{d}{dx}(xq) + p(x)\,xq = q(x) + x\left(\frac{dq}{dx} + p(x)q\right) = q(x) + x \cdot 0 = q(x).$$
EXAMPLE 5.5
Find a solution of $\dfrac{dy}{dx} - \dfrac{y}{x} = x$.
Solution: Notice that $y(x) = x$ is a solution of the homogeneous equation
$$\frac{dy}{dx} - \frac{y}{x} = 0.$$
But x occurs also on the right-hand side of the differential equation. Therefore, according to the rule, a solution is $x \cdot x = x^2$. Check it!
Answer: $y(x) = x^2$ is a solution.
Here is a list of standard guesses for solving $dy/dx + p(x)y = q(x)$ when $p(x)$ is a constant:

q(x)                      Guess
a, constant               A, constant
ax + b                    Ax + B
ax^2 + bx + c             Ax^2 + Bx + C
e^x                       Ae^x
sin x, cos x              A cos x + B sin x
e^x(ax + b)               e^x(Ax + B)
e^x(ax^2 + bx + c)        e^x(Ax^2 + Bx + C)
(ax + b) sin x            (Ax + B) cos x + (Cx + D) sin x
e^x sin x, e^x cos x      e^x(A cos x + B sin x)
METHOD 2.
Begin with any non-zero solution $u(x)$ of the homogeneous equation
$$\frac{du}{dx} + p(x)u = 0.$$
Look for a solution $y(x)$ of the non-homogeneous equation in the form
$$y(x) = u(x)\,v(x).$$
The unknown function is now $v(x)$. Substitute $y = uv$ and $y' = uv' + vu'$ into the differential equation $y' + py = q$:
$$uv' + vu' + puv = q,$$
$$uv' + v(u' + pu) = q.$$
But the expression in parentheses is 0. Hence,
$$uv' = q.$$
Since u is not 0,
$$\frac{dv}{dx} = \frac{q(x)}{u(x)}.$$
Antidifferentiate to find $v(x)$. (This may be difficult or impossible, a disadvantage of the method.)
If $V(x)$ is an antiderivative, then $V(x)u(x)$ is a solution of the differential equation. If you keep the constant of integration, then you get $[V(x) + c]\,u(x)$, which is the general solution of the differential equation. (Why?)
EXAMPLE 5.6
Find the general solution of $\dfrac{dy}{dx} - \dfrac{y}{x} = x^3 + 1$.
Solution: The associated homogeneous equation
$$\frac{du}{dx} - \frac{u}{x} = 0$$
has a solution $u(x) = x$. Set $y = xv$ and substitute:
$$x\frac{dv}{dx} + v - \frac{xv}{x} = x^3 + 1,$$
$$x\frac{dv}{dx} = x^3 + 1, \qquad \frac{dv}{dx} = x^2 + \frac{1}{x}.$$
Antidifferentiate:
$$v = \tfrac13 x^3 + \ln|x| + c.$$
The general solution is $xv(x)$.
Answer: $y(x) = \tfrac13 x^4 + x\ln|x| + cx$.
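Method 2 can also be carried out numerically when the antiderivatives are awkward. The sketch below builds $u(x) = \exp(-\int_{x_0}^{x} p)$ and $v(x) = y_0 + \int_{x_0}^{x} q/u$, then checks the product against the answer to Example 5.6 with c = 0, which corresponds to y(1) = 1/3. The names `simpson` and `method2_solution` and the quadrature settings are illustrative assumptions.

```python
import math

def simpson(f, a, b, n=200):
    """Composite Simpson's Rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def method2_solution(p, q, x0, y0):
    """Solve y' + p(x) y = q(x), y(x0) = y0, in the form y = u * v."""
    u = lambda x: math.exp(-simpson(p, x0, x))                 # u' + p u = 0, u(x0) = 1
    v = lambda x: y0 + simpson(lambda t: q(t) / u(t), x0, x)   # v' = q / u, v(x0) = y0
    return lambda x: u(x) * v(x)

# Example 5.6: y' - y/x = x**3 + 1, with y(1) = 1/3 (that is, c = 0).
y = method2_solution(lambda x: -1.0 / x, lambda x: x ** 3 + 1, 1.0, 1.0 / 3.0)
exact = lambda x: x ** 4 / 3 + x * math.log(x)
print(y(2.0), exact(2.0))
```

Because u is normalized so that $u(x_0) = 1$, the constant of integration in v is just the initial value $y_0$.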
EXAMPLE 5.7
Solve $\dfrac{dy}{dx} + 2xy = x$.
Solution: Solve the associated homogeneous equation
$$\frac{du}{dx} + 2xu = 0:$$
$$\frac{du}{u} = -2x\,dx, \qquad \ln|u| = -x^2 + c.$$
One solution is
$$u(x) = e^{-x^2}.$$
Now set
$$y = e^{-x^2}v,$$
so
$$y' = -2xe^{-x^2}v + e^{-x^2}v'.$$
Substitute into the differential equation:
$$-2xe^{-x^2}v + e^{-x^2}v' + 2xe^{-x^2}v = x,$$
$$v' = xe^{x^2}.$$
Antidifferentiate:
$$v = \tfrac12 e^{x^2} + c.$$
Hence the general solution is
$$y = e^{-x^2}v = \tfrac12 + ce^{-x^2}.$$
Hindsight: We should have guessed the particular solution $y = \tfrac12$.
Alternate Solution: Separate variables:
$$\frac{dy}{dx} = x(1 - 2y), \qquad \frac{dy}{1 - 2y} = x\,dx,$$
$$-\tfrac12\ln|1 - 2y| = \tfrac12 x^2 - \tfrac12 a,$$
$$\ln|1 - 2y| = a - x^2,$$
$$|1 - 2y| = 2ke^{-x^2} \qquad (2k = e^a).$$
It follows that
$$1 - 2y = \pm 2ke^{-x^2},$$
$$y = \tfrac12 + ce^{-x^2}, \qquad c \text{ any constant}.$$
Next we derive an assertion made in the discussion of Method 1 (rather than just pulling it out of a hat).
EXAMPLE 5.8
Find a solution of $\dfrac{dy}{dx} + p(x)y = q(x)$ if $q(x)$ is a solution of the associated homogeneous equation.
Solution: Try a solution $y(x) = v(x)\,q(x)$.
$$\frac{dy}{dx} + py = \frac{d}{dx}(vq) + pvq = \frac{dv}{dx}\,q + v\frac{dq}{dx} + pvq = q\frac{dv}{dx} + v\left(\frac{dq}{dx} + pq\right) = q\frac{dv}{dx} + 0.$$
Hence, the differential equation becomes
$$q\frac{dv}{dx} = q, \qquad \frac{dv}{dx} = 1,$$
so we may take $v(x) = x$.
Answer: $y(x) = xq(x)$ is a solution.
EXERCISES
Find a particular solution:
z . d + i , . , .
5. ^ + y = x2e* 6. ~ + 2y = cos x
ax dx
7. 2 ^ = 3 sin x 8. 6 ^ + y = e3* x 1
ax ax
9. ^ + 2y = cos x 10. L ^ + ifo* = e~*
ax a
11. + Ri = E cos 2 12. ~ + ay = cos 20
dt dd
13. J - y = x* 14. ^ + 2y = e"2*
ax ax
15. y = ex
dx
Use Method 2 of the text to find a particular solution:
16. y - + xy = x 17. ^ = x2+ 1
dx dx x
is = I iq 2xy
dx x x2 dx x2+ 1
Find the general solution:
20. ^ + 32/ = 0 21. ~ + 32/ = 1
22. ^ - 22/ = 3x2+ 2x + 1 23. ^ + y = xe2* + 1
24. ? - 22/ = x2e- 25. L ^ = # cos
ax dt
26. L ^ = Eekt 27. ^ + 2xy = 2x
< ax
28- S + ^ ? I - * . g +
30. Show that the Bernoulli Equation $\dfrac{dy}{dx} + p(x)y = q(x)y^n$ can be reduced to a linear equation by the substitution $z = y^{1 - n}$.
31. (cont.) Solve $y' + xy = \sqrt{y}$; find a formula for the answer, but do not try to evaluate it.
32*. Suppose $q(x)$ satisfies $|q(x)| \le M$ and that $y(x)$ is a solution of the initial-value problem $\dfrac{dy}{dx} + y = q(x)$, $y(0) = 0$. Show that $|y(x)| \le M$ for $x \ge 0$.
33. Sketch the direction field of $y' = x + y$. By inspection of the field find a particular solution of $y' - y = x$.
6. APPLICATIONS
We now consider some applications of the preceding material. In each
case we must first interpret the geometrical or physical data in terms of a
differential equation.
EXAMPLE 6.1
Find all curves y = y( x) with the property that the tangent line
at each point (x, y) on the curve meets the x-axis at (x + 3, 0).
Solution: Sketch the data (Fig. 6.1). The slope of the tangent line is $dy/dx$ on the one hand and $[y - 0]/[x - (x + 3)] = -y/3$ on the other. Hence the differential equation is
$$\frac{dy}{dx} = -\frac13\,y.$$
Answer: $y = ke^{-x/3}$.
EXAMPLE 6.2
Find all curves that intersect each of the rectangular hyperbolas $xy = c$ at right angles.
Solution: The family of hyperbolas is shown in Fig. 6.2. First construct the direction field which at each point (x, y) is tangent to the hyperbola $xy = c$ through the point.
Fig. 6.2
Differentiate:
$$y + xy' = 0.$$
Hence
$$y' = -\frac{y}{x}.$$
The direction field perpendicular to the hyperbola through (x, y) is determined by the negative reciprocal of $y'$. In other words, the differential equation of the direction field is
$$\frac{dy}{dx} = \frac{x}{y}.$$
Separate variables:
$$y\,dy = x\,dx, \qquad y^2 = x^2 - k.$$
Answer: The perpendicular curves (Fig. 6.3) are the rectangular hyperbolas $x^2 - y^2 = k$.
EXAMPLE 6.3
A spherical mothball evaporates at a rate proportional to its
surface area. If half of it evaporates in 3 weeks, when will the
mothball disappear?
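Solution: Let $r = r(t)$ denote the radius at time t in weeks, with $r(0) = r_0$. The volume at time t is $V(t) = \tfrac43\pi r^3 = \tfrac43\pi[r(t)]^3$, and the surface area is $A(t) = 4\pi r^2 = 4\pi[r(t)]^2$.
The data are
$$\text{(i)}\quad \frac{dV}{dt} = -kA, \quad k > 0 \text{ a constant}; \qquad \text{(ii)}\quad V(3) = \tfrac12 V(0) = \tfrac12 \cdot \tfrac43\pi r_0^3.$$
By (i),
$$\frac{d}{dt}\left(\tfrac43\pi r^3\right) = -4k\pi r^2,$$
that is,
$$4\pi r^2\,\frac{dr}{dt} = -4k\pi r^2,$$
from which
$$\frac{dr}{dt} = -k.$$
Hence
$$r = -kt + c.$$
Evaluate the constant c by substituting the initial condition, $r = r_0$ when $t = 0$:
$$r_0 = c.$$
Hence
$$r = r_0 - kt.$$
To find k, use condition (ii):
$$\tfrac43\pi[r(3)]^3 = \tfrac12 \cdot \tfrac43\pi r_0^3, \quad\text{hence}\quad r(3) = \frac{r_0}{\sqrt[3]{2}}.$$
But $r(3) = r_0 - 3k$, consequently
$$\frac{r_0}{\sqrt[3]{2}} = r_0 - 3k, \qquad k = \frac{r_0}{3}\left(1 - \frac{1}{\sqrt[3]{2}}\right).$$
Therefore, the formula for r is
$$r = r_0\left[1 - \frac{t}{3}\left(1 - \frac{1}{\sqrt[3]{2}}\right)\right].$$
From this it follows that the mothball disappears (r = 0) when
$$t = \frac{3}{1 - 2^{-1/3}} = \frac{3\sqrt[3]{2}}{\sqrt[3]{2} - 1}.$$
Answer: The mothball disappears after approximately 14.5 weeks.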
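The arithmetic behind the answer can be reproduced in a few lines. This is only a sketch of the computation above: the function name `time_to_vanish` and its parameters are hypothetical, and the radius is measured in units of $r_0$.

```python
def time_to_vanish(volume_fraction_left, elapsed):
    """Time at which r = r0 - k t reaches 0, given the volume fraction left after `elapsed`."""
    radius_ratio = volume_fraction_left ** (1 / 3)   # r(elapsed) / r0
    k_over_r0 = (1 - radius_ratio) / elapsed         # from r(elapsed) = r0 - k * elapsed
    return 1 / k_over_r0

t_gone = time_to_vanish(0.5, 3.0)   # half the volume is gone after 3 weeks
print(round(t_gone, 2))
```

Since the radius, not the volume, decreases at a constant rate, the answer is considerably more than the 6 weeks a naive proportional guess would suggest.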
EXAMPLE 6.4
Let $T = T(t)$ denote the average temperature of a small piece of hot metal in a cooling bath of fixed temperature 50°. According to Newton's Law of Cooling, the metal cools at a rate proportional to the difference $T - 50$. Suppose the metal cools from 250° to 150° in 30 sec. How long would it take to cool from 450° to 60°?
Solution: In mathematical terms, the Law of Cooling is a linear differential equation:
$$\frac{dT}{dt} = -k(T - 50),$$
that is,
$$\frac{dT}{dt} + kT = 50k.$$
The associated homogeneous equation
$$\frac{dT}{dt} + kT = 0$$
has the general solution
$$T = ce^{-kt}.$$
It is easy to guess the particular solution $T = 50$. Conclusion: the general solution is
$$T = 50 + ce^{-kt}.$$
Setting $t = 0$, we find
$$T(0) = 50 + c, \qquad c = T(0) - 50.$$
Hence
$$T = 50 + [T(0) - 50]e^{-kt}.$$
We must determine the constant k. By hypothesis, if $T(0) = 250$, then $T(30) = 150$:
$$T(30) = 150 = 50 + [250 - 50]e^{-30k},$$
$$100 = 200e^{-30k}, \qquad k = -\frac{1}{30}\ln\frac12 = \frac{1}{30}\ln 2.$$
Therefore,
$$T = 50 + [T(0) - 50]e^{-kt}, \quad\text{where}\quad k = \frac{1}{30}\ln 2.$$
In this formula we substitute $T = 60$ and $T(0) = 450$, then solve for t:
$$60 = 50 + [450 - 50]e^{-kt}, \qquad \frac{10}{400} = e^{-kt},$$
$$t = -\frac{1}{k}\ln\frac{1}{40} = \frac{1}{k}\ln 40 = \frac{30\ln 40}{\ln 2}.$$
Answer: $\dfrac{30\ln 40}{\ln 2} \approx 160$ sec.
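The two-stage computation (calibrate k from one cooling run, then solve for the time of another) can be sketched as follows. The function name `cooling_time` and its argument names are illustrative assumptions.

```python
import math

def cooling_time(T_start, T_end, bath, k):
    """Time for T = bath + (T_start - bath) * exp(-k t) to fall from T_start to T_end."""
    return math.log((T_start - bath) / (T_end - bath)) / k

# Calibration: 250 degrees to 150 degrees in 30 sec in a 50 degree bath.
k = math.log((250 - 50) / (150 - 50)) / 30
t = cooling_time(450, 60, 50, k)
print(k, t)
```

The same function reproduces the calibration run: `cooling_time(250, 150, 50, k)` returns 30 up to rounding.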
EXAMPLE 6.5
Water flows from an open tank through a small hole of area A ft² at the base. When the depth in the tank is x ft, the rate of flow is $(0.60)A\sqrt{2gx}$, where $g = 32.17$ ft/sec². How long does it take to empty a full spherical tank of diameter 10 ft through a circular hole of diameter 2 in. at the bottom? (Assume a small hole at the top admits air.)
Solution: Let V denote the volume when the depth is x. Then
$$\frac{dV}{dt} = -k\sqrt{x}, \quad\text{where}\quad k = (0.60)A\sqrt{2g}.$$
By the Chain Rule,
$$\frac{dV}{dx}\,\frac{dx}{dt} = -k\sqrt{x}.$$
Finding $dV/dx$ is a problem in volume by slicing. Draw a cross section of the tank (Fig. 6.4). The change in volume is $dV = \pi y^2\,dx$. Hence
$$\frac{dV}{dx} = \pi y^2 = \pi[5^2 - (x - 5)^2] = \pi(10x - x^2).$$
The differential equation is
$$\pi(10x - x^2)\frac{dx}{dt} = -k\sqrt{x},$$
$$(10x^{1/2} - x^{3/2})\frac{dx}{dt} = -\frac{k}{\pi}, \qquad x(0) = 10.$$
Separate variables and integrate:
$$\frac{20}{3}x^{3/2} - \frac25 x^{5/2} = -\frac{k}{\pi}\,t + c.$$
Substitute the initial condition, $x = 10$ when $t = 0$:
$$c = \frac{20}{3}\,10^{3/2} - \frac25\,10^{5/2} = \frac{80}{3}\sqrt{10}.$$
Solve for t when $x = 0$:
$$0 = -\frac{k}{\pi}\,t + c, \qquad t = \frac{\pi c}{k},$$
where $k = (0.60)A\sqrt{2g}$, and A is the area of a circle of 2 in. diameter,
$$A = \pi\left(\frac{1}{12}\right)^2 = \frac{\pi}{144}\ \text{ft}^2.$$
Thus
$$k = (0.60)A\sqrt{2g} = \frac{\pi\sqrt{2g}}{240}.$$
The answer to the problem is
$$t = \frac{\pi c}{k} = \frac{80\pi\sqrt{10}/3}{\pi\sqrt{2g}/240} = 6400\sqrt{\frac{5}{g}}\ \text{sec}.$$
Answer: $6400\sqrt{5/g} \approx 2520$ sec.
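The final evaluation is easy to reproduce numerically, which also guards against slips in the unit conversion from inches to feet. The sketch below follows the formulas of the example; the variable names are illustrative.

```python
import math

g = 32.17                              # ft/sec^2, as given in the example
A = math.pi * (1 / 12) ** 2            # hole of diameter 2 in. has radius 1/12 ft
k = 0.60 * A * math.sqrt(2 * g)

def G(x):
    """Antiderivative of 10 x^(1/2) - x^(3/2)."""
    return (20 / 3) * x ** 1.5 - (2 / 5) * x ** 2.5

c = G(10.0)                            # constant fixed by x = 10 at t = 0
t_empty = math.pi * c / k              # time at which x = 0
print(t_empty, 6400 * math.sqrt(5 / g))
```

Both printed values agree and are a little over 2500 sec, roughly 42 minutes.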
EXAMPLE 6.6
When a uniform steel rod of length L and cross-sectional area 1 is subjected to a tension T, it stretches to length L + D, where
$$\frac{D}{L} = kT, \qquad k \text{ a constant}.$$
(This is Hooke's Law, valid if T does not exceed a certain bound.) Suppose the rod is suspended vertically by one end. How much does it stretch? Assume the weight of the rod is W.
Solution: First imagine the rod lying on a horizontal surface. Calibrate the rod by measuring distances from one end. In this way each point of the rod is identified with a number x between 0 and L.
Now suppose the rod is hung by the zero end. Let $y(x)$ denote the distance from the top to the point designated by x. Since the rod is stretched due to its own weight, $y(x) > x$ for $x > 0$. The amount the rod stretches is $y(L) - L$.
Consider a short portion of the rod from x to x + h. See Fig. 6.5. It is under tension $T(x)$, approximately the weight of the portion of the rod between the points marked x and L, i.e.,
$$T(x) \approx \frac{L - x}{L}\,W.$$
How much is this portion stretched by the tension $T(x)$? Its original length is h. When stretched, its ends are at $y(x)$ and $y(x + h)$, so its new length is $y(x + h) - y(x)$. The increase in length is $y(x + h) - y(x) - h$. By Hooke's Law,
$$\frac{y(x + h) - y(x) - h}{h} \approx k\,\frac{L - x}{L}\,W.$$
Fig. 6.5
Let $h \to 0$:
$$\frac{dy}{dx} - 1 = kW\,\frac{L - x}{L}.$$
This is a differential equation for the position $y(x)$ of the point x. Solve it subject to the initial condition $y(0) = 0$. First integrate both sides:
$$y - x = -\frac{kW}{L}\,\frac{(L - x)^2}{2} + c.$$
By the initial condition,
$$0 = -\frac{kWL}{2} + c,$$
hence
$$y - x = \frac12 kWL - \frac{kW}{2L}(L - x)^2.$$
Set $x = L$:
$$y(L) - L = \tfrac12 kWL.$$
Answer: The hanging rod stretches $\tfrac12 kWL$ units of length.
EXERCISES
1. Find all curves that intersect each of the ellipses $x^2 + 3y^2 = c$ at right angles. Sketch the curves. (See Example 6.2.)
2. Find all curves that intersect each of the curves y = ce~*2 at right angles. Sketch
them.
3. The half-life of a radioactive substance is the length of time in which a quantity
of the substance is reduced to half its original mass. The rate of decay is pro
portional to the undecayed mass. If the half-life of a substance is 100 years, how
long will it take a quantity of the substance to decay to yo its original mass?
4. (cont.) If -j- of a radioactive substance decays in 2 hr, what is its half-life?
5. The population of a city is now 110,000. Ten years ago it was 100,000. If the
rate of growth is proportional to the population, predict the population 20 years
from now.
6. When quenched in oil at 50°C, a piece of hot metal cools from 250°C to 150°C in 30 sec. At what constant oil temperature will the metal cool from 250°C to 150°C in 45 sec? (See Example 6.4.)
7. (cont.) The same piece of metal is quenched in oil whose temperature is 50°C, but rising 0.5°C/sec. How long will it take for the metal to cool from 250°C to 150°C?
8. When a cool object is placed in hot gas it heats up at a rate proportional to the difference in temperature between the gas and the object. A cold metal bar at 0°C is placed in a tank of gas at constant temperature 200°C. In 40 sec its temperature rises to 50°C. How long would it take to heat up from 20°C to 100°C?
9. (cont.) How hot should the gas be so that the metal bar heats up from 0°C to 50°C in 20 sec?
10. Two substances, U and V, combine chemically to form a substance X ; one gram
of U combines with 2 gm of V to form 3 gm of X. Suppose 50 gm of U are allowed
to react with 100 gm of V. According to the Law of Mass Action, the chemicals
combine at a rate proportional to the product of the untransformed masses.
Denote by x(t) the mass of X at time t. (The reaction starts at t = 0.) Show that
x(t) satisfies the initial-value problem
If 75 gm of X are formed in the first second, find a formula for x(t).
11. (cont.) Suppose that 50 gm of U are allowed to react with 150 gm of V. (Not all of V will be transformed.) Set up an initial-value problem for $x(t)$ and solve. Verify that $x(t) \to 150$ as $t \to \infty$.
12. A cylindrical tank of diameter 10 ft contains water to a depth of 4 ft. How long will it take the tank to empty through a 2-in. hole in the bottom? (See Example 6.5.)
13. (cont.) Suppose the tank in Ex. 12 is a paraboloid of revolution opening upward.
It contains water to a depth of 4 ft, and the diameter of the tank at the 4-ft level
is 10 ft. How long will it take the tank to empty through a 2-in. hole in the bottom?
14. A falling body of mass m is subject to a downward force mg, where g is the gravitational constant. Due to air resistance, it is subject also to a retarding (upward) force proportional to its velocity. Verify that Newton's Law, F = ma, asserts in this case that
$$m\frac{dv}{dt} = mg - kv, \qquad k > 0 \text{ a constant}.$$
Solve, and show that the velocity does not increase indefinitely but approaches a limiting value. (Assume an initial velocity $v_0$.)
15*. A thermometer is plunged into a hot liquid. A few seconds later it records $T_0$°C. Five seconds later it records $T_1$°C. Five seconds later still it records $T_2$°C. Show that the temperature of the liquid is
$$\frac{T_1^2 - T_0T_2}{-T_0 + 2T_1 - T_2}.$$
(Assume the reading T changes at a rate proportional to the difference between T and the actual temperature.)
16. In Example 6.6, is the amount of stretching of the top half of the rod half the total stretching?
17*. Suppose the rod of Example 6.6 is a long right circular cone. Suspend it vertically by its base. How much does it stretch?

7. APPROXIMATE SOLUTIONS
Successive Approximations
An initial-value problem may be converted into an integral equation, an equation in which the unknown function occurs under the integral sign. Solving the initial-value problem
$$\frac{dy}{dx} = f[x, y(x)], \qquad y(a) = b,$$
is equivalent to solving the integral equation
$$y(x) = b + \int_a^x f[t, y(t)]\,dt.$$
It is easy to verify this equivalence. On the one hand, if $y(x)$ satisfies the integral equation, then $y(a) = b$ and
$$\frac{dy}{dx} = f[x, y(x)].$$
On the other hand, if $y(x)$ satisfies the initial-value problem, then
$$y(x) - b = y(x) - y(a) = \int_a^x \frac{dy}{dt}\,dt = \int_a^x f[t, y(t)]\,dt.$$
One way to attack the integral equation is to compute a sequence
$$y_0(x),\ y_1(x),\ y_2(x),\ \ldots$$
of approximate solutions. The following is an iteration procedure that generally
produces better and better approximate solutions.
Step (0): Choose an approximate solution $y_0(x)$. (The constant
function $y_0(x) = b$ is a reasonable choice.)
Step (1): Set
$$y_1(x) = b + \int_a^x f[t, y_0(t)]\,dt.$$
Usually, $y_1(x)$ is a better approximation than $y_0(x)$.
Step (2): Set
$$y_2(x) = b + \int_a^x f[t, y_1(t)]\,dt,$$
a better approximation still.
Step (n): Set
$$y_n(x) = b + \int_a^x f[t, y_{n-1}(t)]\,dt.$$
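The iteration procedure can be tried numerically. The following is a minimal sketch, not a definitive implementation: it assumes each approximation is stored as a list of values on a uniform grid and that the integral from a to x is replaced by the cumulative trapezoid rule. The function name picard_numeric and its parameters are hypothetical, chosen only for illustration.

```python
import math

def picard_numeric(f, a, b, x_end, n_iter, n_grid=2000):
    """Approximate y' = f(x, y), y(a) = b on [a, x_end] by successive approximations.

    Each approximation is a list of values on a uniform grid; the integral
    from a to x is approximated by the cumulative trapezoid rule (an assumption
    of this sketch, not part of the method itself).
    """
    h = (x_end - a) / n_grid
    xs = [a + i * h for i in range(n_grid + 1)]
    ys = [b] * (n_grid + 1)                      # Step (0): y_0(x) = b
    for _ in range(n_iter):                      # Steps (1), (2), ..., (n)
        g = [f(x, y) for x, y in zip(xs, ys)]    # integrand f[t, y_{k-1}(t)]
        new_ys = [b]
        total = 0.0
        for i in range(1, n_grid + 1):
            total += 0.5 * h * (g[i - 1] + g[i])  # running trapezoid sum
            new_ys.append(b + total)
        ys = new_ys
    return xs, ys

# Try it on y' = y, y(0) = 1, whose exact solution is e^x.
xs, ys = picard_numeric(lambda x, y: y, a=0.0, b=1.0, x_end=1.0, n_iter=20)
print(ys[-1], math.e)
```

The grid version trades exactness for generality: it accepts any f(x, y) that can be evaluated, at the cost of a small quadrature error in every step.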
EXAMPLE 7.1
Construct $y_4(x)$ for the initial-value problem
$$\frac{dy}{dx} = y, \qquad y(0) = 1.$$
Start with $y_0(x) = 1$.
Solution: The integral equation is
$$y(x) = 1 + \int_0^x y(t)\,dt.$$
Apply the iteration procedure starting with $y_0(x) = 1$:
$$y_1(x) = 1 + \int_0^x y_0(t)\,dt = 1 + \int_0^x 1\,dt = 1 + x,$$
$$y_2(x) = 1 + \int_0^x (1 + t)\,dt = 1 + x + \tfrac{1}{2}x^2,$$
$$y_3(x) = 1 + \int_0^x \left(1 + t + \tfrac{1}{2}t^2\right)dt = 1 + x + \tfrac{1}{2}x^2 + \tfrac{1}{6}x^3,$$
$$y_4(x) = 1 + \int_0^x \left(1 + t + \tfrac{1}{2}t^2 + \tfrac{1}{6}t^3\right)dt = 1 + x + \tfrac{1}{2}x^2 + \tfrac{1}{6}x^3 + \tfrac{1}{24}x^4.$$
Answer: $1 + x + \tfrac{1}{2}x^2 + \tfrac{1}{6}x^3 + \tfrac{1}{24}x^4$.
Remark: The answer is the 4-th degree Taylor polynomial of $e^x$.
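For the problem y' = y, y(0) = 1, every approximation is a polynomial, so the iteration can be carried out exactly on coefficient lists. The sketch below assumes a polynomial is stored as a list of exact fractions, lowest power first; the helper names are hypothetical.

```python
from fractions import Fraction

def integrate_from_zero(coeffs):
    """Integrate sum(coeffs[k] * t**k) from 0 to x; return the coefficients in x."""
    return [Fraction(0)] + [c / (k + 1) for k, c in enumerate(coeffs)]

def picard_exponential(n):
    """Coefficients of y_n for y' = y, y(0) = 1, starting from y_0(x) = 1."""
    y = [Fraction(1)]
    for _ in range(n):
        integral = integrate_from_zero(y)   # integral of y_{k-1}(t) dt from 0 to x
        integral[0] += 1                    # add the initial value y(0) = 1
        y = integral
    return y

y4 = picard_exponential(4)
print(y4)   # coefficients of 1, x, x^2, x^3, x^4
```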
EXAMPLE 7.2
Start with the approximation $y_0(x) = 2$ and compute $y_3(x)$ for
the initial-value problem
$$y' = x^2 y - 1, \qquad y(0) = 2.$$
Solution: The corresponding integral equation is
$$y(x) = 2 + \int_0^x [t^2 y(t) - 1]\,dt.$$
The successive approximations are
$$y_1(x) = 2 + \int_0^x [t^2 y_0(t) - 1]\,dt = 2 + \int_0^x (-1 + 2t^2)\,dt = 2 - x + \tfrac{2}{3}x^3,$$
$$y_2(x) = 2 + \int_0^x [t^2 y_1(t) - 1]\,dt = 2 - x + \tfrac{2}{3}x^3 - \tfrac{1}{4}x^4 + \tfrac{1}{9}x^6,$$
$$y_3(x) = 2 + \int_0^x [t^2 y_2(t) - 1]\,dt.$$
Answer:
$$y_3(x) = 2 - x + \tfrac{2}{3}x^3 - \tfrac{1}{4}x^4 + \tfrac{1}{9}x^6 - \tfrac{1}{28}x^7 + \tfrac{1}{81}x^9.$$
Remark: The pattern of the iteration should be clear. If we want $y_4(x)$,
we repeat each term in $y_3(x)$ and add the two new terms:
$$\int_0^x t^2\left(-\tfrac{1}{28}t^7 + \tfrac{1}{81}t^9\right)dt = -\tfrac{1}{280}x^{10} + \tfrac{1}{972}x^{12}.$$
This gives us $y(1) \approx y_4(1) \approx 1.50187$. Similarly the next two terms are
$$\int_0^x t^2\left(-\tfrac{1}{280}t^{10} + \tfrac{1}{972}t^{12}\right)dt = -\frac{1}{13 \cdot 280}x^{13} + \frac{1}{15 \cdot 972}x^{15},$$
which gives us $y(1) \approx y_5(1) \approx 1.50166$.
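The arithmetic of this example can be checked mechanically. The following sketch assumes polynomials are stored as lists of exact fractions, lowest power first; picard_step and evaluate are hypothetical helper names. It reproduces the coefficients of the third approximation and the two values at x = 1 quoted in the remark.

```python
from fractions import Fraction

def picard_step(y):
    """One step of successive approximations for y' = x**2 * y - 1, y(0) = 2."""
    integrand = [Fraction(0), Fraction(0)] + list(y)   # t**2 * y(t)
    integrand[0] -= 1                                  # ... minus 1
    result = [Fraction(2)]                             # initial value y(0) = 2
    result += [c / (k + 1) for k, c in enumerate(integrand)]  # integrate from 0 to x
    return result

def evaluate(coeffs, x):
    """Evaluate the polynomial at x by Horner's rule."""
    value = Fraction(0)
    for c in reversed(coeffs):
        value = value * x + c
    return value

iterates = [[Fraction(2)]]            # y_0(x) = 2
for _ in range(5):
    iterates.append(picard_step(iterates[-1]))

y3 = iterates[3]
y4_at_1 = float(evaluate(iterates[4], 1))
y5_at_1 = float(evaluate(iterates[5], 1))
print(y3)
print(round(y4_at_1, 5), round(y5_at_1, 5))
```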
This method of successive approximations, also called the Picard Method,
is often useful in finding approximate solutions. It is of great theoretical
importance because it can be used to prove what we have taken for granted,
that each initial-value problem for a sufficiently smooth direction field has a
solution.
Taylor Polynomial Approximations
Consider an initial-value problem
$$\frac{dy}{dx} = f[x, y(x)], \qquad y(a) = b,$$
where f(x, y) is a polynomial, rational function, or more generally, a function
that can be differentiated as often as we please. If we assume that y(x) is
differentiable (once) and $dy/dx = f[x, y(x)]$, then
$$\frac{d^2y}{dx^2} = \frac{d}{dx} f[x, y(x)] = f_x[x, y(x)] + f_y[x, y(x)]\,y'(x)$$
exists by the Chain Rule. Hence y(x) is differentiable twice. This implies the
existence in turn of
$$\frac{d^3y}{dx^3} = \frac{d}{dx}\{f_x[x, y(x)]\} + \frac{d}{dx}\{f_y[x, y(x)]\}\,y'(x) + f_y[x, y(x)]\,y''(x)$$
$$= f_{xx} + 2f_{xy}\,y' + f_{yy}\,(y')^2 + f_y\,y''.$$
By continuing this bootstrap operation, we find that $y^{(4)}, y^{(5)}, \ldots$ all exist.
What is more, we can calculate the values of these derivatives at x = a
in terms of an assumed initial value y(a) = b of y(x):
$$y'(a) = f(a, b), \qquad y''(a) = f_x(a, b) + f_y(a, b)\,y'(a),$$
$$y'''(a) = f_{xx}(a, b) + 2f_{xy}(a, b)\,y'(a) + f_{yy}(a, b)\,y'(a)^2 + f_y(a, b)\,y''(a),$$
etc. Hence we can compute the n-th degree Taylor polynomial of y(x) at x = a:
$$p_n(x) = y(a) + \frac{y'(a)}{1!}(x - a) + \frac{y''(a)}{2!}(x - a)^2 + \cdots + \frac{y^{(n)}(a)}{n!}(x - a)^n.$$
It is proved in advanced courses that $p_n(x)$ is a good approximation to the
solution y(x) for n sufficiently large and x near a.
Let us try this technique on the same examples we used for successive
approximations.
EXAMPLE 7.1
Compute the 5-th degree Taylor polynomial at x = 0 of the
function y(x) that satisfies the initial-value problem
$$\frac{dy}{dx} = y, \qquad y(0) = 1.$$
Solution:
$$y' = y, \qquad y'(0) = 1,$$
$$y'' = y' = y, \qquad y''(0) = 1,$$
$$y''' = y'' = y, \qquad y'''(0) = 1,$$
$$y^{(4)} = y, \qquad y^{(4)}(0) = 1,$$
$$y^{(5)} = y, \qquad y^{(5)}(0) = 1.$$
Thus
$$p_5(x) = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \frac{x^5}{5!}.$$
Answer: $1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \frac{x^5}{5!}$.
Remark: This simple problem has an explicit solution, $y = e^x$. The
answer to the example is its Taylor polynomial $p_5(x)$ at x = 0. Observe that
$$p_5'(x) = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} = p_5(x) - \frac{x^5}{5!}.$$
Thus $p_5(x)$ satisfies the initial condition and, near x = 0, nearly satisfies the
differential equation, y' = y.
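The observation in the remark, that the Taylor polynomial misses the differential equation only by the single term x^5/5!, can be confirmed with exact arithmetic. This is a small sketch under the assumption that polynomials are stored as coefficient lists, lowest power first; the variable names are illustrative.

```python
from fractions import Fraction
from math import factorial

# Coefficients of p_5(x) = 1 + x + x^2/2! + ... + x^5/5!
p5 = [Fraction(1, factorial(k)) for k in range(6)]

# Differentiate term by term: the derivative of c*x^k is k*c*x^(k-1).
dp5 = [k * c for k, c in enumerate(p5)][1:]

# Residual p_5'(x) - p_5(x), with the derivative padded to degree 5.
residual = [d - c for d, c in zip(dp5 + [Fraction(0)], p5)]
print(residual)
```

Only the coefficient of x^5 in the residual is nonzero, so for small x the polynomial comes very close to satisfying y' = y.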
EXAMPLE 7.2
Approximate the solution y(x) of the initial-value problem
$$\frac{dy}{dx} = x^2 y - 1, \qquad y(0) = 2$$
by its 9-th degree Taylor polynomial at x = 0.
Solution: The value y(0) = 2 is given; the values $y'(0), y''(0), \ldots, y^{(9)}(0)$
must be computed. Now
$$y' = x^2 y - 1, \qquad y'(0) = -1.$$
Differentiate the expression for y':
$$y'' = 2xy + x^2 y', \qquad y''(0) = 0.$$
Differentiate again:
$$y''' = 2y + 4xy' + x^2 y''.$$
Substitute x = 0 and y(0) = 2 to obtain
$$y'''(0) = 4.$$
Continue in this way:
$$y^{(4)} = 6y' + 6xy'' + x^2 y''', \qquad y^{(4)}(0) = 6y'(0) = -6,$$
$$y^{(5)} = 12y'' + 8xy''' + x^2 y^{(4)}, \qquad y^{(5)}(0) = 12y''(0) = 0,$$
$$y^{(6)} = 20y''' + 10xy^{(4)} + x^2 y^{(5)}, \qquad y^{(6)}(0) = 20y'''(0) = 80,$$
$$y^{(7)} = 30y^{(4)} + 12xy^{(5)} + x^2 y^{(6)}, \qquad y^{(7)}(0) = 30y^{(4)}(0) = -180,$$
$$y^{(8)} = 42y^{(5)} + 14xy^{(6)} + x^2 y^{(7)}, \qquad y^{(8)}(0) = 42y^{(5)}(0) = 0,$$
$$y^{(9)} = 56y^{(6)} + 16xy^{(7)} + x^2 y^{(8)}, \qquad y^{(9)}(0) = 56y^{(6)}(0) = 4480.$$
The coefficients of the Taylor approximation are
$$2, \quad -1, \quad 0, \quad \frac{4}{3!} = \frac{2}{3}, \quad \frac{-6}{4!} = -\frac{1}{4}, \quad 0, \quad \frac{80}{6!} = \frac{1}{9}, \quad \frac{-180}{7!} = -\frac{1}{28}, \quad 0, \quad \frac{4480}{9!} = \frac{1}{81}.$$
Answer:
$$y(x) \approx 2 - x + \frac{2}{3}x^3 - \frac{1}{4}x^4 + \frac{1}{9}x^6 - \frac{1}{28}x^7 + \frac{1}{81}x^9.$$
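The repeated differentiation in this example follows a simple pattern. Differentiating y' = x^2 y - 1 a total of n times by the Leibniz rule and setting x = 0 leaves only the term n(n - 1) y^(n-2)(0) for n >= 2. The sketch below relies on that recurrence, which is a restatement of the computation above rather than a formula quoted from the text; the function name is hypothetical.

```python
from fractions import Fraction
from math import factorial

def derivatives_at_zero(order):
    """Values y(0), y'(0), ..., y^(order)(0) for y' = x**2 * y - 1, y(0) = 2.

    Uses y^(n+1)(0) = n*(n-1) * y^(n-2)(0) for n >= 2, obtained by
    differentiating the equation n times and setting x = 0.
    """
    d = [2, -1, 0]                         # y(0), y'(0), y''(0)
    for n in range(2, order):
        d.append(n * (n - 1) * d[n - 2])
    return d[:order + 1]

derivs = derivatives_at_zero(9)
# Taylor coefficients: y^(k)(0) / k!
coeffs = [Fraction(v, factorial(k)) for k, v in enumerate(derivs)]
print(derivs)
print(coeffs)
```

The coefficients agree with those of the third successive approximation for the same problem, which is consistent with both methods approximating the same solution.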
EXERCISES
Compute $y_3(x)$ by successive approximations for the initial-value problem; choose
$y_0(x) = y(0)$:
1. $y' = xy + x$; $y(0) = 0$    2. $y' = xy + y$; $y(0) = 1$
3. $y' = xy + 1$; $y(0) = 2$    4. $y' = y + \sin x$; $y(0) = 1$
5. $y' =$ [illegible]; $y(0) = 1$    6. $y' = 1 + xy^2$; $y(0) = 0$
7. $y' = x + y^3$; $y(0) = 0$    8. $y' = x^2 + y^2$; $y(0) =$ [illegible]
9. $y' = x^2 + y^2$; $y(0) = 0$    10. $y' =$ [illegible]; $y(0) = 1$.
Without solving, compute the 5-th degree Taylor polynomial at x = 0 of the function
satisfying the initial-value problem:
11. $y' = xy + x$; $y(0) = 0$    12. $y' = xy + y$; $y(0) = 1$
13. $y' = xy + 1$; $y(0) = 2$    14. $y' = y + \sin x$; $y(0) = 1$
15. $y' = x^3 y^{[\text{illegible}]}$; $y(0) = 1$    16. $y' = 1 + xy^2$; $y(0) = 0$
17. $y' = x + y^3$; $y(0) = 0$    18. $y' = x^2 + y^2$; $y(0) = 0$.
Compute the 4-th degree Taylor polynomial at x = 0 for the initial-value problem:
19. $y' = y \sin(x^2)$; $y(0) = 1$    20. $y' = 1 + x^2 e^{-y}$; $y(0) = 1$
21. $y' = e^x + xy^2$; $y(0) = 1$    22. $y' = \cos(xy) - 1$; $y(0) = 1$
23. $y' = x \sinh y$; $y(0) = 0$    24. $y' = (1 + x^2)(1 + y^2)$; $y(0) = 2$.
25. The height x(t) of a certain balloon satisfies tx = 0.5 (^z)2. When t = 10 sec,
x = 10 ft. Estimate x(12) to 0.1 ft accuracy.
14. Second Order Equations
and Systems
1. LINEAR EQUATIONS
We shall study second order linear differential equations with constant
coefficients:
$$\frac{d^2x}{dt^2} + p\,\frac{dx}{dt} + qx = r(t),$$
where p and q are constants and r(t) is a given function. Equations of this type
have many physical applications, particularly to elastic or electric phenomena.
First order linear equations are solved by finding one particular solution
and adding it to the general solution of the associated homogeneous equation.
The same method applies to second order linear equations for exactly the
same reason. Regard the left-hand side of the equation as a black box;
an input x(t) leads to an output $\ddot{x} + p\dot{x} + qx$. The sum x + y of two inputs
leads to the sum of the corresponding outputs:
$$(x + y)^{\cdot\cdot} + p(x + y)^{\cdot} + q(x + y) = (\ddot{x} + p\dot{x} + qx) + (\ddot{y} + p\dot{y} + qy).$$
A constant multiple ax of an input leads to the same constant multiple
of the output:
$$(ax)^{\cdot\cdot} + p(ax)^{\cdot} + q(ax) = a(\ddot{x} + p\dot{x} + qx).$$
Thus the black box is linear.
The general solution of
$$\frac{d^2x}{dt^2} + p\,\frac{dx}{dt} + qx = r(t)$$
is
$$x(t) + z(t),$$
where x(t) is any solution of the differential equation and z(t) is the
general solution of the homogeneous equation
$$\frac{d^2x}{dt^2} + p\,\frac{dx}{dt} + qx = 0.$$
Therefore, as in the last chapter, we first treat homogeneous equations,
then look for particular solutions of non-homogeneous equations.
2. HOMOGENEOUS EQUATIONS
In this section we study homogeneous equations
$$\ddot{x} + p\dot{x} + qx = 0,$$
where p and q are constants. There are three important special cases,
each with p = 0, namely $q = 0$, $q = -k^2$, and $q = k^2$.
Case 1: $\ddot{x} = 0$. This is easy. The solution is
$$x = at + b.$$
Case 2: $\ddot{x} - k^2 x = 0$. Here, the second derivative of x is proportional
to x itself, which reminds us of the exponential function. It is not hard to
guess two solutions: $x = e^{kt}$ and $x = e^{-kt}$. By linearity,
$$x = ae^{kt} + be^{-kt}$$
is also a solution. In Section 7 we shall prove this is the general solution.
Case 3: $\ddot{x} + k^2 x = 0$. Again it is easy to guess two solutions: $x = \cos kt$
and $x = \sin kt$. In Section 7 we shall prove the general solution is
$$x = a \cos kt + b \sin kt.$$
Summary
Differential equation         General solution
$\ddot{x} = 0$                $x = at + b$
$\ddot{x} - k^2 x = 0$        $x = ae^{kt} + be^{-kt}$
$\ddot{x} + k^2 x = 0$        $x = a \cos kt + b \sin kt$
These special cases are important because each homogeneous equation
$$\ddot{x} + p\dot{x} + qx = 0$$
can be reduced to one of them. The trick is analogous to reducing a quadratic
equation $X^2 + pX + q = 0$ to the form $Y^2 \pm k^2 = 0$ by completing
the square. Set
$$x(t) = e^{ht} y(t),$$
where y(t) is a new unknown function and h is a constant. Compute
derivatives by the Product Rule:
$$\dot{x} = he^{ht}y + e^{ht}\dot{y} = e^{ht}(\dot{y} + hy);$$
$$\ddot{x} = he^{ht}(\dot{y} + hy) + e^{ht}(\ddot{y} + h\dot{y}) = e^{ht}(\ddot{y} + 2h\dot{y} + h^2 y).$$
Substitute these expressions into the differential equation:
$$e^{ht}(\ddot{y} + 2h\dot{y} + h^2 y) + pe^{ht}(\dot{y} + hy) + qe^{ht}y = 0.$$
Cancel $e^{ht}$ and group terms:
$$\ddot{y} + (2h + p)\dot{y} + (h^2 + ph + q)y = 0.$$
Now choose $h = -p/2$ to make the term in $\dot{y}$ drop out. The coefficient of y
is then
$$h^2 + ph + q = \frac{p^2}{4} - \frac{p^2}{2} + q = \frac{-p^2 + 4q}{4},$$
and the differential equation becomes
$$\ddot{y} - \frac{p^2 - 4q}{4}\,y = 0.$$
This is one of the three special cases, depending on whether $p^2 - 4q$ is zero,
positive, or negative.
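The reduction can be expressed as a short computation. The sketch below is one possible arrangement, with a hypothetical function name and return format: given p and q, it returns the exponent h = -p/2, the coefficient c of y in the reduced equation, and the special case that results. The sample pairs (p, q) are the coefficients used in Examples 2.1, 2.2, and 2.3.

```python
from fractions import Fraction

def reduce_equation(p, q):
    """Reduce x'' + p x' + q x = 0 by the substitution x = e^{ht} y, h = -p/2.

    Returns (h, c, case), where the reduced equation is y'' + c*y = 0 and
    c = h^2 + p*h + q = -(p^2 - 4q)/4.
    """
    p, q = Fraction(p), Fraction(q)
    h = -p / 2
    c = h * h + p * h + q
    if c == 0:
        case = "y'' = 0"
    elif c < 0:
        case = "y'' - k^2 y = 0"
    else:
        case = "y'' + k^2 y = 0"
    return h, c, case

for p, q in [(6, 9), (-3, 2), (2, 5)]:
    print(p, q, reduce_equation(p, q))
```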
EXAMPLE 2.1
Solve $\ddot{x} + 6\dot{x} + 9x = 0$.
Solution: Here
$$p = 6, \qquad q = 9, \qquad h = -\frac{p}{2} = -3, \qquad \frac{p^2 - 4q}{4} = 0.$$
The substitution $x = e^{-3t}y$ transforms the differential equation into
$$\ddot{y} = 0,$$
from which
$$y = at + b.$$
Answer: $x = e^{-3t}(at + b)$, where
a and b are constants.
EXAMPLE 2.2
Solve $\ddot{x} - 3\dot{x} + 2x = 0$.
Solution: Here
$$p = -3, \qquad q = 2, \qquad h = -\frac{p}{2} = \frac{3}{2}, \qquad \frac{p^2 - 4q}{4} = \frac{1}{4} = \left(\frac{1}{2}\right)^2.$$
The substitution $x = e^{3t/2}y$ transforms the differential equation into
$$\ddot{y} - \left(\frac{1}{2}\right)^2 y = 0,$$
one of the standard forms. Conclusion:
$$y = ae^{t/2} + be^{-t/2}.$$
Answer: $x = e^{3t/2}(ae^{t/2} + be^{-t/2}) = ae^{2t} + be^{t}$.
EXAMPLE 2.3
Solve $\ddot{x} + 2\dot{x} + 5x = 0$.
Solution: In this example
$$p = 2, \qquad q = 5, \qquad h = -\frac{p}{2} = -1, \qquad \frac{p^2 - 4q}{4} = -4 = -2^2.$$
Set $x = e^{-t}y$. Then
$$\ddot{y} + 2^2 y = 0,$$
$$y = a \cos 2t + b \sin 2t.$$
Answer: $x = e^{-t}(a \cos 2t + b \sin 2t)$.
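Formula for Solving $\ddot{x} + p\dot{x} + qx = 0$
When you set
$$x = e^{-(p/2)t}y,$$
the differential equation
$$\ddot{x} + p\dot{x} + qx = 0$$
is transformed into
$$\ddot{y} - \frac{\Delta}{4}\,y = 0, \qquad \text{where } \Delta = p^2 - 4q.$$
Its solution depends on the sign of $\Delta$:
Case 1: $\Delta = 0$. Then
$$x(t) = e^{-(p/2)t}(at + b).$$
Case 2: $\Delta > 0$. Write $\Delta = \sigma^2$. Then
$$x(t) = e^{-(p/2)t}\left(ae^{(\sigma/2)t} + be^{-(\sigma/2)t}\right) = ae^{[(-p + \sigma)/2]t} + be^{[(-p - \sigma)/2]t}.$$
Case 3: $\Delta < 0$. Write $\Delta = -\sigma^2$. Then
$$x(t) = e^{-(p/2)t}\left(a \cos\frac{\sigma t}{2} + b \sin\frac{\sigma t}{2}\right).$$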
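The three cases translate directly into a small routine. The following is a sketch only: the names general_solution and residual are hypothetical, the discriminant test uses exact equality (reliable for integer p and q, less so for floating-point input), and the check of each solution uses central differences, so the residuals are small rather than exactly zero.

```python
import math

def general_solution(p, q, a, b):
    """Return a function x(t) solving x'' + p x' + q x = 0 for constants a, b."""
    delta = p * p - 4 * q
    if delta == 0:                                   # Case 1
        return lambda t: math.exp(-p * t / 2) * (a * t + b)
    sigma = math.sqrt(abs(delta))
    if delta > 0:                                    # Case 2
        return lambda t: math.exp(-p * t / 2) * (
            a * math.exp(sigma * t / 2) + b * math.exp(-sigma * t / 2))
    return lambda t: math.exp(-p * t / 2) * (        # Case 3
        a * math.cos(sigma * t / 2) + b * math.sin(sigma * t / 2))

def residual(x, p, q, t, h=1e-4):
    """Central-difference estimate of x'' + p x' + q x at time t."""
    d1 = (x(t + h) - x(t - h)) / (2 * h)
    d2 = (x(t + h) - 2 * x(t) + x(t - h)) / (h * h)
    return d2 + p * d1 + q * x(t)

# Coefficients of Examples 2.1, 2.2, 2.3 with arbitrary constants a, b.
checks = [residual(general_solution(p, q, 1.5, -0.5), p, q, 0.3)
          for p, q in [(6, 9), (-3, 2), (2, 5)]]
print(checks)
```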
Now compare these three cases with the three cases describing the roots
of the quadratic equation
X 2 + p X + q = 0.
By the quadratic formula, the roots are
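$$\alpha = \frac{-p + \sqrt{\Delta}}{2}, \qquad \beta = \frac{-p - \sqrt{\Delta}}{2}.$$
Case 1: $\Delta = 0$. The roots are real and equal:
$$\alpha = \beta = -\frac{p}{2}.$$
Case 2: $\Delta > 0$. Write $\Delta = \sigma^2$.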